Overly Complicated URLs

Thursday, November 29, 2007

After failing at my database design, I decided to take a step back and work on easier issues first. How should the new URLs look? The following are my thoughts mostly in the order they came. This is not a presentation of results, but a look at how my brain weaves its way through different mazes in an (in this case unsuccessful) attempt to arrive at a solution.

I started with this given: all old URLs will continue to work, at least for the older posts so as not to break anything. I decided to move away from the current URL design because for the most part it doesn’t provide useful information.

My first attempt:

http://sewcrates.com/Archive/2007/Database_Shmatabase/

http://sewcrates.com/Archive/2007/Database_Shmatabase_2/ (for duplicate titled entries)

http://sewcrates.com/Archive/2007/Database_Shmatabase_2/photo_001.jpg (the URL for a photograph uploaded and referenced in the above post)

To complicate matters, each photograph also gets its own individual post; the URL below is the one for the img_013647.jpg file. This allows photos to exist outside of their photo albums or posts, so that tagging can be used to sort and display them. I’ll use the same process for doodles, and for movie and book lists and reviews. I will also need to automate the creation of the albums. Many times I do not have the patience to do much more than upload lots of photos and assign a name and global tag. I want the ability to go back (or have Doolies go back) and individually name and tag the photos at a later date. (This is how the current version of sewcrates works, although most of the photo information is contained in the album and not tied to the individual photograph.)

http://sewcrates.com/Crazy%20Monsters/2007/img_013647_jpg/

I’m still not decided on the year part of the URL. In the current sewcrates, I used the full date format, along with the category (what will now be tags), e.g., http://sewcrates.com/Archive/2007-11-26-15:21:36/. I realized that most people do not care much about the full date. I used the full date, including the time (and microsecond, because, you know, sometimes I posted twice in the same tenth of a second), to allow multiple posts within a single day. The original version of sewcrates.com supported only one post per date: http://sewcrates.com/2007-11-26/ (at least that’s how I remember it).
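Since the old URLs have to keep working, here is a minimal sketch of how the two older styles could be recognized. The regular expressions and the function name parse_legacy_path are my own assumptions for illustration, based only on the two example URLs above; this is not the existing sewcrates code.

```python
import re
from datetime import datetime

# Assumed shapes of the two older URL styles, taken from the examples above.
FULL_STAMP = re.compile(
    r"^/(?P<category>[^/]+)/(?P<stamp>\d{4}-\d{2}-\d{2}-\d{2}:\d{2}:\d{2})/?$"
)
DATE_ONLY = re.compile(r"^/(?P<date>\d{4}-\d{2}-\d{2})/?$")


def parse_legacy_path(path):
    """Return (category, datetime) for an old-style path, or None if it is not one."""
    try:
        m = FULL_STAMP.match(path)
        if m:
            stamp = datetime.strptime(m.group("stamp"), "%Y-%m-%d-%H:%M:%S")
            return m.group("category"), stamp
        m = DATE_ONLY.match(path)
        if m:
            # The original one-post-per-day style carried no category.
            return None, datetime.strptime(m.group("date"), "%Y-%m-%d")
    except ValueError:
        # Digits in the right places but not a real date (e.g. month 13).
        return None
    return None
```

Anything that returns None here would fall through to whatever the new URL scheme turns out to be.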

It is a bit strange to see the year in the URL in this way. For example, http://sewcrates.com/Crazy%20Monsters/2007/img_013647_jpg/ may be followed by http://sewcrates.com/Crazy%20Monsters/2008/img_019234_jpg/, which is a bit confusing from a logical perspective. With that said, when I am reading the internet, I do enjoy glancing up at the URL to see what year the post was made. This quickly clears up whether this is something brand new, or a very old archive. The month and the day, while important, do not fulfill this purpose, and make the URL unnecessarily complicated (at least more so than it is currently). Alternatively, I could just use http://sewcrates.com/Crazy%20Monsters/img_013647_jpg/. This doesn’t help with identifying the year of the post—but the post itself will provide that information. So many decisions to make!

The other advantage to the single year is the directory structure. While I attempted to use the database to store my doodles in castofhorribles, I realized it was not efficient, from both a storage and a speed perspective. (I now save all my doodles in both file form and in the database. While I’m able to read both the database and file versions of the .png files, I cannot read the .ai files back from the large blobs in the database. This may be a MySQL limitation on large files; my .ai files are usually larger than 1.5 MB. I haven’t done the legwork to check on this, since I don’t use any of these tables to display the files. This may change in this or the next version. But that’s a much longer discussion.) I want to store all files in a directory. So, continuing with the above example, I will store the jpg in: http://sewcrates.com/2007/img_013647.jpg. The advantage is that I am creating fewer directories. The disadvantage is that there will be a larger number of files in each directory. I don’t see a clear advantage either way. I’ve almost argued myself into returning the month and day to the URL. It’s an easy enough switch once I start programming, so I’ll leave this on my list. For the record, my doodles are stored in http://castofhorribles.com/doodles/. It’s one large directory with all the files. Since I guarantee a unique filename for each doodle (I base it on a simplified version of the title, and add “_n” if it’s a duplicate), there should never be a conflict in the directory.
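The unique-filename rule is simple enough to sketch. This is an illustration only: the function names simple_name and unique_name are made up, the exact simplification rule (anything other than letters and digits becomes an underscore) is an assumption, and I am assuming the first duplicate gets "_2", as in the Database_Shmatabase_2 example earlier.

```python
import re


def simple_name(title):
    """Reduce a title to letters, digits, and underscores (assumed rule)."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", title).strip("_")
    return cleaned or "untitled"


def unique_name(title, taken):
    """Return a simple name not already in `taken`, appending _2, _3, ... on collisions.

    `taken` is a set of names already in use; the chosen name is added to it.
    """
    base = simple_name(title)
    candidate = base
    n = 2
    while candidate in taken:
        candidate = f"{base}_{n}"
        n += 1
    taken.add(candidate)
    return candidate
```

In practice `taken` would come from the directory listing or the database rather than an in-memory set, but the collision logic would be the same.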

As I think more about this (man, this is becoming much more of a planning exercise than I originally realized), I’m thinking of moving the category away from the first part of the URL. I included it in the URL for the current version of sewcrates because it was the easiest way to provide for stepping through the different posts. For example, you could use the next/previous links to move through all of my http://sewcrates.com/Writing/ posts, such as http://sewcrates.com/Writing/2007-08-21-00:00:00/ to http://sewcrates.com/Writing/2007-10-28-23:14:16/. In the next version of sewcrates, there are larger issues I need to contend with. What if a user wanted to see all photos with Doolies in an album format (similar to how I use tags to show multiple Horribles: http://castofhorribles.com/tags/Doolies/)? There are two ways of providing that. The first is to open it in a large album, something like: http://sewcrates.com/Photos/Tags/Doolies.

I also have the option of using variables in the URL: http://sewcrates.com/2007/?tags=albums+julie;years=2007+2006. (You can do this two ways: present this query string as the URL itself, or show a slash-based URL and map it to this complicated one in the .htaccess file. The latter is how castofhorribles works.)
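To make the two forms concrete, here is a small sketch that parses the variable-style query and builds a slash-based equivalent. The slash layout it produces (/tags/albums+julie/years/2007+2006/) is purely my assumption for illustration; it is not the actual castofhorribles mapping, and the function names are hypothetical. It also skips percent-decoding.

```python
def parse_query(query):
    """Parse 'tags=albums+julie;years=2007+2006' into a dict of lists."""
    params = {}
    # Accept either ';' or '&' between pairs.
    for pair in query.lstrip("?").replace("&", ";").split(";"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        params[key] = [v for v in value.split("+") if v]
    return params


def to_slash_path(params):
    """Build a slash-based path from the parsed parameters (assumed layout)."""
    parts = []
    for key, values in params.items():
        parts.append(key)
        parts.append("+".join(values))
    return ("/" + "/".join(parts) + "/") if parts else "/"
```

The reverse direction (slash path back to query variables) is what the .htaccess rules would handle on the server side.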

I’ve written so much and decided to come back around:

http://sewcrates.com/img_013647_jpg/tag1/tag2/tag3/etc - where tags can be search terms: “years=1973-1976”; this will show one image, and the next/previous buttons will allow you to step through other posts that meet the tag criteria in chronological order.
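The next/previous stepping could work roughly like this. It is a sketch under assumptions: posts are plain dicts with hypothetical "name", "date", and "tags" keys, the tags are a set, and a post matches only if it carries every requested tag. Range terms like "years=1973-1976" would need extra handling that is not shown here.

```python
from datetime import date


def neighbors(posts, current, required_tags):
    """Return (previous, next) post names among posts carrying all required tags.

    Posts are ordered chronologically; either side is None at the ends.
    Returns (None, None) if the current post does not match the tags itself.
    """
    matching = sorted(
        (p for p in posts if required_tags <= p["tags"]),
        key=lambda p: p["date"],
    )
    names = [p["name"] for p in matching]
    if current not in names:
        return None, None
    i = names.index(current)
    prev_name = names[i - 1] if i > 0 else None
    next_name = names[i + 1] if i + 1 < len(names) else None
    return prev_name, next_name
```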

http://sewcrates.com/album/tag1/tag2/tag3/etc - this is more complicated. Here is how it will work: for albums that contain only one type (and the type will be included as a tag, e.g., Photos, Books, Cast of Horribles, Movies, Musings), I will present a thumbnail version of the page. Thumbnails for photographs and doodles are the smaller version of the picture; for text posts, it’s just the title.

http://sewcrates.com/list/tag1/tag2/tag3/etc - this is the thumbnail view for multiple types of posts, e.g., one page with all the thumbnails tagged Doolies, which would include thumbnails (if available) for photographs, doodles, and posts. (How it will be formatted will be a huge challenge. I think I did a good job with the photo thumbnails in sewcrates and the indexed thumbnails in castofhorribles because they were all of the same size. If we throw in text titles and different-sized thumbnails, I’ll run into huge problems. But, again, that’s for another time.)

http://sewcrates.com/about/ - the URL for these “special” pages will remain the same.
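Putting the four URL shapes above together, the routing might look something like this sketch. The names parse_route and SPECIAL_PAGES and the returned (kind, target, tags) tuple are my own invention for illustration, and only /about/ is listed as a special page because it is the only one named here.

```python
# Assumption: only /about/ is known to be a "special" page.
SPECIAL_PAGES = {"about"}


def parse_route(path):
    """Split a path into (kind, target, tags).

    kind is one of "home", "special", "album", "list", or "post".
    target is the page or post simple name where one applies, else None.
    """
    segments = [s for s in path.split("/") if s]
    if not segments:
        return ("home", None, [])
    head, tags = segments[0], segments[1:]
    if head in SPECIAL_PAGES and not tags:
        return ("special", head, [])
    if head in ("album", "list"):
        return (head, None, tags)
    # Anything else is treated as a post's simple name followed by tags.
    return ("post", head, tags)
```

One wrinkle with this design: a post whose simple name happens to be "album", "list", or "about" would collide with the reserved words, so those names would have to be excluded when simple names are generated.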

I will store all files in a single /data/ directory, controlled by a simple name with the same rules as for castofhorribles.

That means I’m dropping the year from the URL. This provides more flexibility with only a small loss of readability. The years will still be available in the posts themselves. I’ve said a lot (most of which I later contradicted), and programmed very little.

Everything is a tag. The tags may have names: location=Seattle, WA. That’s a tag. This enables you to create a post with a Seattle, WA title and not confuse the system. For titles, you have title=This is a nice title. And then you create the simplename from that title that is used in the URL.
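A named tag is easy to split apart. A minimal sketch, assuming the name and value are separated by the first "=" and that a plain tag simply has no name (parse_tag is a hypothetical helper, not existing code):

```python
def parse_tag(text):
    """Split 'location=Seattle, WA' into ('location', 'Seattle, WA').

    Plain tags with no '=' come back with a name of None.
    Only the first '=' separates name from value, so values may contain '='.
    """
    name, sep, value = text.partition("=")
    if not sep:
        return (None, text.strip())
    return (name.strip(), value.strip())
```

Because the name travels with the value, a post titled "Seattle, WA" and a post located in Seattle, WA end up as two different tags.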

This allows me to create albums by referring back to titles. This is complicated, though. Wouldn’t a date be a tag? And in this case, I might be overloading what I mean by tags. A tag should not be so complicated. It’s just a list of items. I could yank out all the specific tags. I’ll use it for two purposes: to categorize and to provide information. That way everything is contained within a single table. But to what end? It allows me to change and add templates without worrying about changing the database. Is that helpful? I’m not sure.

As I continue to think about this, other issues pop out. Do I want to have the flexibility to have parent/child relationships between photographs (or assets, as Movable Type calls them) and larger posts? Or do I want to tie them together through tags of tags, i.e., tags where there is a “type” and a “value” field, such as “title=This is a long stupid title”?

Ugh, my brain hurts.

Seattle, WA