HeavyJobs

What (summary)

Manage long-running jobs on available compute resources (servers), using database tables to keep track of work and inter-process communication to keep track of workers.
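As a rough illustration of the "tables keep track of work" idea, here is a minimal sketch in Ruby. It is not the HeavyJobs implementation: an in-memory array stands in for the database table, a Mutex stands in for a row lock or transaction, and the names `Chunk`, `ChunkTable`, `claim_next`, `finish`, and `counts`, along with the column names, are all hypothetical.

```ruby
# Hypothetical sketch: an in-memory stand-in for a database table of work chunks.
# The columns (id, status, worker, updated_at) are assumptions, not the real schema.
Chunk = Struct.new(:id, :status, :worker, :updated_at)

class ChunkTable
  def initialize(ids)
    @rows = ids.map { |id| Chunk.new(id, :pending, nil, Time.now) }
    @lock = Mutex.new # stands in for a row lock or transaction
  end

  # Atomically hand the next pending chunk to a worker; returns nil if none is left.
  def claim_next(worker)
    @lock.synchronize do
      row = @rows.find { |r| r.status == :pending }
      if row
        row.status = :running
        row.worker = worker
        row.updated_at = Time.now
      end
      row
    end
  end

  # Mark a chunk as finished (or failed) so that it is not handed out again.
  def finish(id, status = :done)
    @lock.synchronize do
      row = @rows.find { |r| r.id == id }
      if row
        row.status = status
        row.updated_at = Time.now
      end
      row
    end
  end

  # Per-status row counts, the kind of summary a monitoring page could show.
  def counts
    @lock.synchronize do
      @rows.each_with_object(Hash.new(0)) { |r, h| h[r.status] += 1 }
    end
  end
end
```

A worker would call `claim_next("fetcher-1")`, process the returned chunk, and then call `finish(chunk.id)`. The point of the claim step is that two workers never receive the same chunk; with a real database the same guarantee would have to come from a transaction or an atomic update.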

Why this is important

We will use this infrastructure to manage our algorithmic data collection. This is a strategic direction for the company.

DoneDone

We will be satisfied with this infrastructure when:

  • we can launch, balance, and diagnose all steps of our pilot WHOIS refresh path:
    • fetchers
    • parsers
    • aggregators
  • we have startup scripts that will resume proper job processing after a machine reboot or other operational events.
  • we can monitor overall health and productivity of all heavy job processing through a web interface.

Bugs and Todos

(new items)

  • Detect when a worker goes dark for more than 2 minutes. Record its last status in the chunk, then terminate and restart the worker.
  • From feed_aggregator: :error=>"private method `log_error' called for #<0xb7e9c318>"
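The "goes dark" item above could be approached along these lines. This is only a sketch under stated assumptions: it supposes that each worker reports a status periodically (a heartbeat) and treats a worker as dark when its last report is older than the threshold. Only the 2-minute figure comes from the item itself; `Watchdog`, `WorkerState`, `heartbeat`, `dark_workers`, and `sweep` are hypothetical names, and the actual terminate and restart step is left to the caller's block.

```ruby
# Hypothetical sketch of a "gone dark" check. The heartbeat mechanism and all
# names are assumptions; only the 2-minute threshold comes from the todo item.
DARK_AFTER = 120 # seconds

WorkerState = Struct.new(:name, :last_status, :last_seen)

class Watchdog
  def initialize(dark_after = DARK_AFTER)
    @dark_after = dark_after
    @workers = {}
  end

  # Called whenever a worker reports in with its current status.
  def heartbeat(name, status, now = Time.now)
    @workers[name] = WorkerState.new(name, status, now)
  end

  # Workers whose last report is older than the threshold.
  def dark_workers(now = Time.now)
    @workers.values.select { |w| now - w.last_seen > @dark_after }
  end

  # For each dark worker: build a record of its last status, yield its name so
  # the caller can terminate and restart it, then stop tracking the old entry.
  # Returns the records so the caller can store them with the chunk.
  def sweep(now = Time.now)
    dark_workers(now).map do |w|
      record = { worker: w.name, last_status: w.last_status, last_seen: w.last_seen }
      yield w.name if block_given?
      @workers.delete(w.name)
      record
    end
  end
end
```

A supervising process might call `sweep` on a timer and pass a block that kills and relaunches the named worker. The returned records hold the last known status, which is what the item asks to have saved in the chunk.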

