Entries in flickr (7)

Sunday
May 22, 2011

In Search Of Quality

Well, we’re getting back on track.  I’ve spent some time trying to return us to our previous level of quality, and to let us go back to adding longer blacklists.

I believe I’ve now managed to get us back to where we were - but it doesn’t come cheap.

The solution depends on the fact that we have a known sort order for all returned results: interestingness.  If we assume that interestingness, as a score, is stable and not random, then a query can be split not only by positive term but also by negative terms.  For each positive term, we run one query per group of negative terms and take the intersection of the results: if a photo doesn’t appear in every one of the split queries for that positive term, it must contain one of the negative terms, and can be discarded from the result set.
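A minimal sketch of the split-and-intersect idea, in Python. This is not the service's actual code: `run_query` is a hypothetical stand-in for a Flickr search that takes one positive term plus a group of negative terms and returns photo ids, the default group size of 9 is borrowed from the request estimate in this post, and the sketch assumes every split query returns its complete, stably ordered result list.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Set

# Hypothetical query function: (positive term, negative terms) -> photo ids.
QueryFn = Callable[[str, Sequence[str]], Iterable[str]]


def chunked(items: Sequence[str], size: int) -> List[Sequence[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def search_excluding(positive: str, negatives: Sequence[str],
                     run_query: QueryFn, group_size: int = 9) -> Set[str]:
    """Run one query per group of negative terms and intersect the results."""
    # With no negative terms we still need a single plain query.
    groups = chunked(negatives, group_size) or [[]]
    surviving: Optional[Set[str]] = None
    for group in groups:
        ids = set(run_query(positive, group))
        # A photo missing from any split query matched one of that group's
        # negative terms, so the intersection drops it.
        surviving = ids if surviving is None else surviving & ids
    return surviving if surviving is not None else set()
```

Each extra group of negative terms costs one more query per positive term, which is where the additional API load comes from.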

This means, again, we’re doing a lot more queries. Specifically, 6 locations * 2 years * # of positive search terms * (# of negative search terms / 9) ~= 1.5k Flickr API requests per search result as a worst case. (This used to be a lot higher - I’ve had to bring down the search range in order to allow for more queries per positive term.)
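The worst-case estimate can be written as a small function, sketched here in Python. The function name and the rounding-up of the negative-term groups are assumptions; the post only gives the plain product.

```python
import math


def worst_case_requests(locations: int, years: int, positive_terms: int,
                        negative_terms: int, negatives_per_query: int = 9) -> int:
    """Upper bound on outbound requests for one uncached search."""
    # Assumption: a partial group of negative terms still costs a full query,
    # so round up; with no negative terms, one query per positive term remains.
    negative_groups = max(1, math.ceil(negative_terms / negatives_per_query))
    return locations * years * positive_terms * negative_groups
```

For example, `worst_case_requests(6, 2, 25, 45)` comes to 1500. The 25 positive and 45 negative terms are invented inputs, picked only because they land near the 1.5k figure; they are not the real list sizes.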

Now, we do a lot of caching along the way whenever possible, because, again, this is a free service on freebie quota, but that’s a lot of outbound API calls per inbound user request.

Hopefully things look better again.

Monday
Apr 25, 2011

Weathrman Search Quality: A Progress Report

I’m in the process of doing some server-side results joining, to reduce the complexity of our queries.  The upside of this is that I should be able to, over time, bring us back up to our previous quality of searches.  In fact, we’ll be slightly better off - a series of decomposed searches means more queries, and so more search results to process.

This means the load on the Flickr API - and the load on my server - is going up.

I’m still in the awful position of not being able to charge for this, having made it free once - so I still want this to fall within freebie quota if at all humanly possible.

Last, I’ve got a shiny new Xoom, and I’ll be spending some time on the client shortly getting some basic changes into the UI to make it slightly prettier when used there.  I’ve already made some server-side tweaks to broaden the minimum image size so that Xoom resolutions get more Flickr results.  The biggest problem at the moment is that there just aren’t many large image sizes fetchable via the API, and the images that do come back are far below the 1920x1408 background image size that a Xoom wants.

More soon.

Monday
Feb 14, 2011

Sick days, Weathrman, and you

So I’m home sick today, and going through the laundry list of things I haven’t gotten around to recently.

Weathrman is on that list.

I’ve pushed up a new version of the server that performs searches on behalf of app users.  This is designed to do two things:  reduce the cost of running the service, and improve the search quality.

In order to reduce the cost, I’m going to do the obvious thing:  I’m going to perform fewer searches.  Specifically, that means:

  • Ask for a 4-hour window around ‘now’ instead of a 3-hour window.
  • Increase the search window from 3 months to 4 months (2 in either direction).
  • Fetch 4 pages of results instead of 5.
  • Reduce the minimum number of results needed at any search level from 5 to 3.
  • Remove street-level searching (level 16) - it too rarely has results, and just creates latency.

This means our worst case goes from being 7 tiers of 5 parallel searches (35 searches) to being 6 tiers of 4 parallel searches (24).  It also means the average case, which cost 1 tier of 5 searches before we were likely to find any results, will now get faster, and I won’t be paying for that wasted time.

The downside is that I’m only fetching four pages of results.  Searches cover the entire four-month window, so it’s possible that none of the first four pages will have photos taken within the required time-of-day range; if that happens, we skip up a level.  Dropping one of the pages of search results makes that more likely, and widening the 3-hour window to 4 probably doesn’t completely make up for it.  The net effect is that searches may feel… a little less local.
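Here is a rough Python sketch of that tier-by-tier fallback, under several assumptions: `fetch_page` is a hypothetical function returning one page of photos for a search level, the 4-hour window is read as two hours either side of ‘now’, and the pages are fetched one after another for simplicity, where the server runs them in parallel.

```python
from datetime import datetime
from typing import Callable, List, NamedTuple, Optional, Sequence, Tuple


class Photo(NamedTuple):
    photo_id: str
    taken: datetime


def within_time_of_day(taken: datetime, now: datetime,
                       window_hours: float = 4.0) -> bool:
    """True if `taken` falls inside the window centred on now's time of day."""
    taken_min = taken.hour * 60 + taken.minute
    now_min = now.hour * 60 + now.minute
    diff = abs(taken_min - now_min)
    diff = min(diff, 24 * 60 - diff)  # wrap around midnight
    return diff <= window_hours * 60 / 2


def tiered_search(levels: Sequence[int],
                  fetch_page: Callable[[int, int], List[Photo]],
                  now: datetime, pages: int = 4,
                  min_results: int = 3) -> Optional[Tuple[int, List[Photo]]]:
    """Try each level, most local first; skip up when too few photos match."""
    for level in levels:
        matches: List[Photo] = []
        for page in range(1, pages + 1):
            matches.extend(p for p in fetch_page(level, page)
                           if within_time_of_day(p.taken, now))
        if len(matches) >= min_results:
            return level, matches
    return None  # nothing usable at any level
```

Fewer pages per level means fewer candidates to filter, so the loop moves on to a broader level more often; that is the trade-off described above.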

Hopefully, these changes mean I’ll be able to continue supporting this service for longer, and more cheaply.  Going back to the days when the client was responsible for performing all these searches just isn’t going to happen - it’s just too convenient to be able to run the searches from the server, and much more reliable.  It does mean that I’m bearing the cost of a free app - but as long as I can keep the costs down, I don’t mind.

Saturday
May 15, 2010

Weathrman 3: The Weather Cloud

One of the big problems with Weathrman’s current implementation is that the whole thing lives on your phone; a worst-case search can trawl through literally tens of thousands of search results, looking for an image relevant to your weather conditions and time of day.

Many of those searches are common to others; city-level searches are the same all over London, for example; local searches are the same for everyone sitting near me.  Much of this can be cached aggressively, massively reducing the amount of time it takes to get good results, and allowing me to do more searches, more often, at less cost and lower latency to end users.
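One way to picture the sharing is a cache keyed on a coarse location cell, so that everyone inside the same cell reuses one set of results. The Python sketch below is only an illustration: `cache_key`, `SearchCache`, the cell size in degrees and the expiry time are all hypothetical, not taken from the real service.

```python
import math
import time
from typing import Callable, Dict, Hashable, List, Tuple


def cache_key(lat: float, lon: float, cell_degrees: float,
              conditions: str) -> Tuple[int, int, float, str]:
    """Snap a position to a grid cell so nearby users share one key."""
    return (math.floor(lat / cell_degrees), math.floor(lon / cell_degrees),
            cell_degrees, conditions)


class SearchCache:
    """Tiny time-limited cache; entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, List[str]]] = {}

    def get_or_search(self, key: Hashable,
                      search: Callable[[], List[str]]) -> List[str]:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh enough: no outbound API calls
        results = search()  # the expensive part, e.g. many Flickr requests
        self._entries[key] = (now, results)
        return results
```

With a large cell for city-level searches, two users on opposite sides of central London map to the same key, so only the first of them pays for the search.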

It’s not ready yet - it’s probably another day or so away, and then I’ll test it out for a week or so.  Come Google I/O, though, I’ll be ready to ship.

It’s notably faster, and pushing the image scaling to the server has resulted in massive improvements in image quality, while turning hundreds of RPCs to Flickr into a single call to the Weathrman service.

Monday
Feb 1, 2010

Weathrman 2.5: In the mix

So I’ve made a few more small tweaks since the 2.x series started, the biggest of which is in who provides our weather data.

Yahoo’s feed is damn good; it has flaws, but it also has huge benefits to us - the most important of which is that Yahoo’s weather API provides current conditions and the sunrise/sunset times in your location.  This means that as of 2.4, we started preferring photographs of sunrises and sunsets whenever the weather was clear or cloudy.

As of 2.5, we’re again increasing the number of queries we perform on a search, which will increase the amount of time we spend updating.  The upside is that we’ve got more images to choose from.  As of now, we’re going to stop preferring what Flickr thinks is interesting, and start picking at random from the photos we find at the nearest location to you that has any.
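In Python, the new selection rule might look something like the sketch below. The function name and the dictionary of candidates per search level are hypothetical; the app's real data structures aren't shown in this post.

```python
import random
from typing import Dict, List, Optional, Sequence


def pick_photo(candidates_by_level: Dict[int, List[str]],
               levels_nearest_first: Sequence[int],
               rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick a random photo from the nearest level that has any candidates."""
    rng = rng or random.Random()
    for level in levels_nearest_first:
        candidates = candidates_by_level.get(level, [])
        if candidates:
            # Uniform choice instead of taking Flickr's top-ranked result.
            return rng.choice(candidates)
    return None  # no candidates at any level
```

Choosing uniformly rather than taking the top-ranked photo is what stops the same image coming back day after day.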

For a while, I was seeing the same photos, day after day - now, I don’t think I’ve seen the same thing twice.  At the moment, I have a particularly beautiful view of Leicester Square, taken by maistora.

I look forward to hearing your opinions on the new version.