Member
invarbrass   07-06-2008, 05:40
#1

Hi,

I am currently building a large gallery, but I have yet to put it into a production environment because I'm concerned about scalability.

As of now, the gallery has > 2000 albums/folders. As the site grows, I think we may run into performance issues. Thousands of directories and files in the same place is not a very efficient solution. We'll be screwed if there are a lot of concurrent visitors!

I am proposing to rework the core engine to accommodate more scalable solutions:

  1. There should be a separate "uploads" folder. Users shouldn't upload directly into the "albums" folder; instead, they put new albums in "uploads". Then, through the admin panel or perhaps via cron, we import the newly added albums into the main "albums" folder. Nobody except the zen engine should mess with the "albums" directory. The reasons are described below.

  2. The "albums" directory structure should be made scalable, so that it can efficiently store a large number of albums and images. Instead of storing thousands of albums at the same directory level, we can store them based on depth. This is how I'd implement it (à la cache-lite):

Let's say I have uploaded a new album to the "uploads" dir, and zp is now importing it into the main "albums" dir. Let's assume the name of the dir is "New Album 1".

zp adds the directory name ("New Album 1") to the MySQL table as the album_title;

then we generate a hash value from the dirname/albumname:
$md5 = md5($new_album_name);

Let's assume the hash value is "abc123". This will be the key for the new album, and we'll compute the path of the album from this key.

Let's assume we want to store the albums 3 levels deep into the "albums" directory.

$scalable_dir_depth = 3;

So we deduce the path of the album from its hash key ("abc123") in this way:
/albums/a
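
A minimal sketch of the idea, assuming one directory level per leading hash character and the full hash as the final directory name. Both of those choices are assumptions, and album_storage_path is a hypothetical helper, not part of zenphoto:

```php
<?php
// Hypothetical helper: derive a nested storage path from an album name.
// Assumption: each directory level is one successive character of the md5 hash,
// and the album itself lives in a directory named after the full hash.
function album_storage_path(string $albumName, int $depth = 3, string $root = '/albums'): string
{
    $hash = md5($albumName); // 32 hex characters

    // Clamp the depth so we never ask for more levels than the hash has characters.
    $depth = max(0, min($depth, strlen($hash)));

    // Split the first $depth characters into one path segment each.
    $levels = $depth > 0 ? str_split(substr($hash, 0, $depth)) : [];

    return $root . '/' . implode('/', array_merge($levels, [$hash]));
}

$scalable_dir_depth = 3;
echo album_storage_path('New Album 1', $scalable_dir_depth), "\n";
```

With a depth of 3 and hex characters, the albums are spread over at most 16 x 16 x 16 = 4096 leaf directories, so no single directory level holds more than 16 entries above the leaves.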

Member
invarbrass   07-06-2008, 06:03
#2

There are some caveats I forgot to mention about generating unique hashes for duplicate albums: how do we handle sub-albums?

Here's how:

1. The new album has no sub-albums: we can safely use the algorithm, no problems.

2. The new album has sub-directories:

/New-album-1/sub-1/sub-2/sub-3

In this case, we compute the unique hash from the deepest directory (sub-3).
I.e., we calculate hash values for these directories as-is:

  • New-album-1
  • sub-1
  • sub-2

It doesn't matter if the above folders are already present in the "albums" directory.
But for sub-3, we'll generate the "duplicate-safe" hash value.

How do we handle sub-albums at the presentation level? That's for the developers to figure out... ;-)

Perhaps we could introduce a "parent_id" field in the MySQL table and define a master-detail relationship for the sub-albums?
SELECT ... FROM ... WHERE parent_id="x";

Gonna be pretty complicated.
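
A rough sketch of how the parent_id idea might look on the PHP side. The rows are hard-coded stand-ins for a query result, and the column names (id, parent_id, title) and both helper functions are assumptions for illustration only:

```php
<?php
// Hypothetical rows, shaped like the result of a query on an albums table.
$rows = [
    ['id' => 1, 'parent_id' => null, 'title' => 'New-album-1'],
    ['id' => 2, 'parent_id' => 1,    'title' => 'sub-1'],
    ['id' => 3, 'parent_id' => 2,    'title' => 'sub-2'],
    ['id' => 4, 'parent_id' => 3,    'title' => 'sub-3'],
];

// In-memory equivalent of: SELECT ... FROM ... WHERE parent_id = x
function child_albums(array $rows, ?int $parentId): array
{
    return array_values(array_filter($rows, function ($row) use ($parentId) {
        return $row['parent_id'] === $parentId;
    }));
}

// Follow the parent_id links upwards to rebuild the logical album path.
function album_path(array $rows, int $id): string
{
    $byId = array_column($rows, null, 'id'); // index the rows by their id
    $parts = [];
    while ($id !== null && isset($byId[$id])) {
        array_unshift($parts, $byId[$id]['title']);
        $id = $byId[$id]['parent_id'];
    }
    return '/' . implode('/', $parts);
}

echo album_path($rows, 4), "\n"; // prints /New-album-1/sub-1/sub-2/sub-3
```

The presentation layer would then only need child_albums() for listing a level and album_path() for breadcrumbs, regardless of where the hashed directories physically live.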

Administrator
acrylian   07-06-2008, 09:28
#3

I'm not an expert on these things, so I'll let my fellow developers answer that. But Zenphoto should actually be quite scalable: http://www.zenphoto.org/2007/12/installation-and-upgrading/#6 (statement from our project leader Trisweb).

Member
sbillard   07-06-2008, 15:58
#4

Also, your proposed change violates a fundamental premise of zenphoto--that the files in the folders are the defining element of the gallery and that the database only holds metadata.

Member
invarbrass   09-06-2008, 07:01
#5

hmmm, I didn't know that. But then, I've been tinkering with zp only for a week!

Anyways, I've rearranged my gallery. Instead of placing all the albums at the same level, I've put them under different sub-albums.

Interestingly, I also experienced a MySQL crash that corrupted the zenphoto database, and I had to drop the database. Thanks to zenphoto's folder-based approach, I just had to run setup.php again!

So I'll be sticking with the default zp distro, no crazy scalability hacks for me! ;-)

  