My daily readings 08/12/2012

  • tags: Shopping

  • tags: programming

    • And then all hell broke loose. Many of those boss’s boss’s bosses were seriously pissed at me, though no one could or would articulate exactly why. Some of my co-workers started treating me like I had contracted an infectious disease. Slowly, I figured out that I had jumped into the middle of some complex interdepartmental power struggle. In my own clueless way I hadn’t sped up the graphics so much as I had supplied a war-winning weapon to one organizational faction, and the other factions were not happy. Eventually, grudgingly, they did away with the second process along with the socket and we got our faster graphics. Nevertheless, elation seemed thin on the ground.
    • Every time some annoying new hire comes to me with a dumb idea that is obviously not going to work, “Stay the Hell out of other people’s code,” plays back in my head and I listen harder.
  • tags: crawler

    • I carried out this project because (among several other reasons) I wanted to understand what resources are required to crawl a small but non-trivial fraction of the web. In this post I describe some details of what I did. Of course, there’s nothing especially new: I wrote a vanilla (distributed) crawler, mostly to teach myself something about crawling and distributed computing. Still, I learned some lessons that may be of interest to a few others, and so in this post I describe what I did. The post also mixes in some personal working notes, for my own future reference.
    • Code: Originally I intended to make the crawler code available under an open source license at GitHub. However, as I better understood the cost that crawlers impose on websites, I began to have reservations. My crawler is designed to be polite and impose relatively little burden on any single website, but could (like many crawlers) easily be modified by thoughtless or malicious people to impose a heavy burden on sites. Because of this I’ve decided to postpone (possibly indefinitely) releasing the code.

Posted from Diigo. The rest of my favorite links are here.
