Sunday, August 07, 2005

Irises - Van Gogh

I just created a counted cross stitch chart of Van Gogh's Irises and put a copy in my hidden stash. Creating cross stitch patterns is tricky business, particularly if you're lazy like me. I don't want to edit the image, preferring to just upload it and let the software iron out the wrinkles.
There are several guidelines I use to pick candidates for charting. First is to avoid faces if possible. Sometimes they come out all right. But since the human brain has a special portion of itself set aside just for facial processing, faces don't tend to chart well. The reduction in the number of colors, even with an extended set of blended threads, tends to drop the quality below what we will accept. Second is to pick an image with a topic relevant to the target market: lighthouses, weddings, nature, cats, etc. Impressionist art is my personal favorite. I like to look for images on Google. There are other image search sites as well. Then pick an appropriate background color if possible. White is the best background color. I'm not afraid of being minimalist and not throwing stitches over every weave on the canvas.
Then drive the complexity of the image by determining the stitch count and the use of blended threads. A high thread count makes for a lot of thread changes. A limited thread count makes for less detail and realism. A cartoon type of image is ideal for low thread counts. The same goes for black and white or grayscale images. If you decide to expand your palette with blended threads, the complexity of the chart increases as well.
The isolated stitch count is important too. If you want to make the chart easier to stitch, raise the count to something higher, like 3 to 5, at the cost of reducing the detail of the image. You can achieve maximum detail by setting the isolated stitch count to zero. I recommend zero if you edit your image before uploading it, to preserve the human touch.
Even though I'm too lazy to manually edit the images I use, software cannot replace the benefit of human editing. More on that another time.
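For the curious, here is a rough sketch of what an isolated stitch setting might do under the hood. This is my guess at the idea, not the code of any particular charting program: it assumes the chart is a grid of color names, that "isolated" means a connected group of at most min_count same-colored stitches, and that such a group gets recolored to the most common neighboring color. The function name and the grid layout are made up for the example.

```python
from collections import Counter, deque


def remove_isolated_stitches(grid, min_count):
    """Recolor connected groups of at most min_count same-colored stitches.

    grid is a list of rows, each a list of color names (an assumed layout).
    A min_count of zero leaves the chart untouched, keeping maximum detail.
    """
    if min_count <= 0:
        return [row[:] for row in grid]
    rows, cols = len(grid), len(grid[0])
    result = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            color = grid[r][c]
            group, border = [], Counter()
            queue = deque([(r, c)])
            seen[r][c] = True
            # Flood fill over up/down/left/right neighbors of the same color.
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if grid[ny][nx] == color:
                            if not seen[ny][nx]:
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                        else:
                            # Tally the colors touching this group.
                            border[grid[ny][nx]] += 1
            # Small groups take on the color they touch most often.
            if len(group) <= min_count and border:
                replacement = border.most_common(1)[0][0]
                for y, x in group:
                    result[y][x] = replacement
    return result
```

With a single black stitch in a field of white and min_count set to 1, the black stitch is absorbed into the white. Raising min_count absorbs bigger groups, which is where the loss of detail comes from.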

Saturday, August 06, 2005

RSS vs. Atom

I was looking at implementing a syndication feed for our catalog the other day. An XML feed allows data to be transferred between entities without the formatting text included. Or you can include some of it, if you really want. It's structured data that can be easily parsed by software. HTML does not have this characteristic, but XML does. This blog, for example, has a feed. This was brought about by my seeing an increase in websites that include the "alternate" link tag in their pages. I like to be the first craft site on the block to add a new feature, so I looked into it. The big decision is the feed format. There are currently two strongly supported formats for syndication. The first is RSS, which stands for Really Simple Syndication, or something else depending on who you talk to. RSS has been around for some time and you can view its history by clicking on the aforementioned link. The other format is Atom, which is being pushed by Google. Both formats have specifications by industry standard groups and both are supported by a wide variety of software programs.
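To show what "easily parsed by software" looks like in practice, here is a small sketch that reads the entries out of an Atom document using Python's standard library. The sample feed, its example.com addresses and the list_entries function are all invented for illustration; only the Atom namespace and element names come from the format itself.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used when searching the parsed document.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# A made-up feed with one made-up entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Craft Catalog</title>
  <id>http://example.com/catalog</id>
  <updated>2005-08-07T00:00:00Z</updated>
  <entry>
    <title>Irises chart</title>
    <id>http://example.com/catalog/irises</id>
    <updated>2005-08-07T00:00:00Z</updated>
  </entry>
</feed>"""


def list_entries(feed_xml):
    """Return a (title, updated) pair for every entry in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (
            entry.findtext("atom:title", namespaces=ATOM),
            entry.findtext("atom:updated", namespaces=ATOM),
        )
        for entry in root.findall("atom:entry", ATOM)
    ]


for title, updated in list_entries(SAMPLE_FEED):
    print(title, updated)
```

Doing the same against an HTML product page would mean guessing which bits of markup hold the title and which are decoration.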
But which is better? I decided on Atom. My reasons reach beyond just what Google tells me to do and have more to do with why Google is backing it. I think that the reason Google favors Atom over RSS is that Atom is better. "RSS is too complicated" is the basis for that. This is an overgeneralized statement, but I'm hoping to boil the argument down without delving into excruciating detail. I have seen the new RSS 2.0 format, and yes, it is simpler, but it is not simple enough and allows for considerable abuse. RSS, in its 1.0 form, is based on RDF, which is a really good way to store data and is used in software like Mozilla. The downfall of RSS, I believe, is that it allows for extensibility via RDF. This is normally a good thing and a very powerful feature. But when trading data between websites, it is unnecessary. Since both websites must agree on the XML format prior to transfer, there is no way for the receiving website to interpret tags that the sending website has added to the format. This doesn't make RDF or RSS bad; rather, it means that they are not ideal for the use I'm intending. Atom lacks the overhead of support for tag extension and as a result is simpler.
On the internet, there is a trend that usually holds. In a conflict of two standards, one will win. Not all the time, but most of the time. By win I mean widespread adoption at the expense of the other. I think that some websites will support only RSS for a long time, but they will be a shrinking minority. Many websites will support both for a while until they drop RSS and only support Atom to reduce maintenance. Some sites, probably the majority, will support just Atom. Market forces will push increasingly towards Atom over time.
So I settled on Atom. I didn't do the actual implementation though. I don't store the last modified time for products and categories in my database, and Atom requires it. Including the last modified time was something I thought about during design, but I discarded it because I couldn't see a use for it later on. My mistake. As soon as I add the extra columns to my database, I'll be able to finish the feed. Hopefully in time, product feeds from different sites will allow consumers to accurately comparison shop without significant overhead. I'm tired of the tab delimited format standard. Which makes me wonder, why doesn't Froogle take Atom?
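Here is roughly what the finished feed generator could look like once the last modified column exists. Treat it as a sketch under assumptions: the build_feed function, the product dictionaries with "name", "url" and "modified" keys, and the example.com addresses are all hypothetical, and the product list is assumed to be non-empty. The "updated" elements are the part that needs the missing database column.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_feed(feed_id, title, author_name, products):
    """Build an Atom document from a non-empty list of product dicts.

    Each product is assumed to have "name", "url" and "modified" keys,
    where "modified" is a timezone-aware datetime.
    """
    feed = ET.Element("feed", xmlns=ATOM_NS)
    ET.SubElement(feed, "id").text = feed_id
    ET.SubElement(feed, "title").text = title
    author = ET.SubElement(feed, "author")
    ET.SubElement(author, "name").text = author_name
    # The feed-level timestamp is the most recent product change.
    newest = max(product["modified"] for product in products)
    ET.SubElement(feed, "updated").text = newest.isoformat()
    for product in products:
        entry = ET.SubElement(feed, "entry")
        ET.SubElement(entry, "id").text = product["url"]
        ET.SubElement(entry, "title").text = product["name"]
        # This is the value that needs the last modified column.
        ET.SubElement(entry, "updated").text = product["modified"].isoformat()
        ET.SubElement(entry, "link", rel="alternate", href=product["url"])
    return ET.tostring(feed, encoding="unicode")


catalog = [
    {
        "name": "Irises chart",
        "url": "http://example.com/catalog/irises",
        "modified": datetime(2005, 8, 7, tzinfo=timezone.utc),
    },
]
print(build_feed("http://example.com/catalog", "Example Craft Catalog",
                 "Example Shop", catalog))
```

Without a stored modification time, the only honest value for "updated" would be the time the feed was generated, which tells a feed reader nothing about what actually changed.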

Tuesday, August 02, 2005

Raster to Vector Conversion

There was a story on slashdot.org about different vector image editors. The story seemed to be brought up due to the hype about Firefox supporting SVG in a future release. Vector graphics support in web browsers is exciting to me since I have been working on raster to vector conversion software in Java for counted cross stitch patterns for so long. There appears to be a general consensus that a raster image can't be converted to a vector image. There isn't even an entry on Wikipedia with any information regarding conversion. I've begun to think that, with the advent of browser support for SVG, there is a possible market for a service or product that does raster to vector conversion. There is an immense pool of raster images on the internet, but very limited support for converting them to vector images.
A search on Google displays a large selection of articles regarding region growing and image segmentation. The feeling I get is that raster to vector conversion has been thought of as too processor intensive or RAM heavy to solve with brute force computation. This might no longer be true with modern hardware. There is also an emphasis on real-time image processing, which is highly time bound and doesn't fit the same market as the conversion tool I'm thinking of. Admittedly, photo-realistic images are not an ideal fit for a vector format. But if you get past the idea that photos will get watered down by the conversion, or that the file becomes too large, it just feels right to store all images as vector based. I can't back that feeling up with any evidence or hard numbers, unfortunately.
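To make the idea concrete, here is about the most naive raster to vector conversion I can think of. It is not the region growing approach from those articles and not my Java software. It simply merges each horizontal run of same-colored pixels into one SVG rectangle. The raster_to_svg function and the grid of color names are assumptions made for the example.

```python
def raster_to_svg(grid, cell=1):
    """Convert a grid of color names into SVG, one <rect> per horizontal run.

    grid is a list of equal-length rows of CSS color strings (assumed layout).
    cell is the size of one pixel in SVG units.
    """
    height, width = len(grid), len(grid[0])
    rects = []
    for y, row in enumerate(grid):
        x = 0
        while x < width:
            start, color = x, row[x]
            # Extend the run while the color stays the same.
            while x < width and row[x] == color:
                x += 1
            rects.append(
                f'<rect x="{start * cell}" y="{y * cell}" '
                f'width="{(x - start) * cell}" height="{cell}" fill="{color}"/>'
            )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width * cell}" height="{height * cell}">'
        + "".join(rects)
        + "</svg>"
    )


picture = [
    ["red", "red", "blue"],
    ["blue", "blue", "blue"],
]
print(raster_to_svg(picture, cell=10))
```

A flat-colored image like a cross stitch chart collapses into a handful of shapes this way, while a photo produces nearly one rectangle per pixel, which is the file size problem in a nutshell. A serious converter would merge runs into larger regions and trace their outlines as paths.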
Converting to vector images raises a couple of questions though. Number one is, can meaning be drawn from the vectorized image and stored for searching? I haven't solved this issue in my mind yet. It's easy to pull lower end information out of an image, like colors and simple shapes. But pulling "horse" out of a picture of a horse is something else entirely. I know it can be done because my brain is doing it right now, but reproducing it in software is another matter. The second issue is video. This is the simpler of the two problems to solve. Video would be very processor intensive to encode. The decoding process would be trivial though, and could possibly lead to smaller file sizes than MPEG. I took a look at the format for MPEG and was stunned by how convoluted it is. I'm thinking that the industry is willing to take a hit on the resources and time it takes to encode a video in return for trivial decoding and a slightly smaller file size. Smaller, since only the difference between one frame and the next would be stored, without dropping in a refresher frame every fourth frame and without all the "super block" hoop jumping. If you look at what Macromedia has done with Flash, this approach just seems right too.
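The "store only the difference" idea can be sketched in a few lines. This is a toy, not a video codec: it assumes every frame is a list of the same length, where each slot holds whatever describes one shape (here just a color name), and the function names are invented. It stores the first frame whole and, for every later frame, only the slots that changed.

```python
def encode_deltas(frames):
    """Return the first frame plus, per later frame, a list of (index, value) changes."""
    first = list(frames[0])
    deltas = []
    previous = first
    for frame in frames[1:]:
        changes = [
            (i, new)
            for i, (old, new) in enumerate(zip(previous, frame))
            if old != new
        ]
        deltas.append(changes)
        previous = frame
    return first, deltas


def decode_deltas(first, deltas):
    """Rebuild every frame by replaying the changes on top of the first frame."""
    frames = [list(first)]
    for changes in deltas:
        frame = list(frames[-1])
        for i, value in changes:
            frame[i] = value
        frames.append(frame)
    return frames


clip = [
    ["sky", "sky", "horse"],
    ["sky", "horse", "sky"],
    ["sky", "horse", "sky"],
]
first, deltas = encode_deltas(clip)
print(deltas)
```

Decoding is just a loop of assignments, which is the trivial part. The expensive part, which this sketch skips entirely, is turning raw video into shapes that stay stable from frame to frame in the first place.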
My primary inhibitor to working on this is the fear of hubris. I'm not Einstein in a patent office. I sell crafts on the internet. Who do I think I am to pull off something that others have dismissed as implausible? I've been looking around for someone who is doing this, and doing it well, and I haven't found anyone yet. If you have, let me know.