Origins of Ord
This is a story about the making of Ord, a tool for linguistic inquiry.
Everyone knows Wikipedia is a fount of knowledge, but there's more magic than meets the eye.
On every Wikipedia article page, the sidebar contains a link to the same article in other languages.
Let's open the browser console and play with this data.
Cool, Wikipedia uses jQuery, so we get it for free in the console.
Let's get that data!
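Here's roughly what that console session could look like. This is a sketch, not the original snippet: it uses the plain DOM API instead of jQuery so the same function can be run against a stand-in document, `collectTranslations` is a hypothetical name, and the `li.interlanguage-link a` selector plus the `lang`, `href`, and `title` attributes are assumptions about Wikipedia's sidebar markup, which differs between skins and changes over time.

```javascript
// Sketch: collect the sidebar's interlanguage links from a page.
// ASSUMPTION: each link is an <a> inside <li class="interlanguage-link">
// and carries `lang`, `href`, and `title` attributes.
function collectTranslations(doc) {
  const links = doc.querySelectorAll('li.interlanguage-link a');
  return Array.from(links).map((a) => ({
    lang: a.getAttribute('lang'),   // language code of the target wiki
    href: a.getAttribute('href'),   // URL of the translated article
    title: a.getAttribute('title'), // link title as written in the markup
  }));
}

// In the browser console: collectTranslations(document)
```

If the selector matches nothing, the function simply returns an empty array, which is a handy signal that the markup assumption needs adjusting.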
Time to switch from the browser to the editor and build a scraper:
The scraper uses cheerio, a tiny implementation of core jQuery designed specifically for node.
Yay! Now we have a standalone scraping module. But something's missing...
Each translation result has a language code, but that's a bit opaque. What the heck language is
Wikipedia has its own system for codifying languages, based loosely on other ISO and IETF standards.
So let's scrape that HTML table and stick the data in a new node module:
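A module like that can be very small. The sketch below assumes the scraped table has already been boiled down to rows of `{ code, name, nativeName }`; those field names and the `lookup` function are hypothetical, and the three rows are hand-typed samples rather than scraped data.

```javascript
// Sketch of a language-lookup module.
// ASSUMPTION: the scraped table has been reduced to { code, name, nativeName }
// rows; only three hand-typed sample rows are shown here.
const languages = [
  { code: 'de', name: 'German', nativeName: 'Deutsch' },
  { code: 'fr', name: 'French', nativeName: 'français' },
  { code: 'ja', name: 'Japanese', nativeName: '日本語' },
];

// Index the rows by code once so each lookup is a single Map read.
const byCode = new Map(languages.map((l) => [l.code, l]));

// Return the row for a language code, or null when the code is unknown.
function lookup(code) {
  return byCode.get(String(code).toLowerCase()) || null;
}

module.exports = { languages, lookup };
```

With this in place, `lookup('de').name` turns an opaque code into something a human can read.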
Now we have two node modules that provide all the Wikipedia data we need. The next step is to build a small web app that wraps these modules.
The app is tiny. It has but one route.
And thanks to Express, turning Ord into a JSON webservice is a one-liner.
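To make the one-route, one-liner idea concrete, here's a sketch of what such a handler might look like. It is not Ord's actual source: `translate(word, cb)` stands in for the scraping module, and the `/:word` path, the `format` query parameter, and the `index` view name are all made up for illustration. The only Express pieces it leans on are `req.params`, `req.query`, `res.render`, and `res.json`.

```javascript
// Sketch of a single route handler that serves either HTML or JSON.
// ASSUMPTIONS: `translate(word, cb)` stands in for the scraping module, and
// the `/:word` path, `format` query parameter, and `index` view are made up.
function makeRoute(translate) {
  return function (req, res, next) {
    translate(req.params.word, function (err, translations) {
      if (err) return next(err); // hand failures to Express error handling
      // The "one-liner": res.json serializes the value and sets the header.
      if (req.query.format === 'json') return res.json(translations);
      res.render('index', { word: req.params.word, translations });
    });
  };
}

// Mounting it on an Express app would look like:
//   app.get('/:word', makeRoute(translate));
```

Building the handler with a factory keeps it free of any direct dependency on the scraper, so it can be tried out with a fake `translate` function.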
Works on mobile too, with a little media-query love.
One last thing: Let's try to sort the results by etymological similarity.
There's an npm module for that.
The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
We can use it to determine how similar (or different) two words are.
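For the curious, here's what's going on under the hood. This is a hand-rolled sketch of the standard dynamic-programming approach, not the npm module's code, and `sortBySimilarity` with its `title` field is a hypothetical shape for the translation results. It compares UTF-16 code units, so characters outside the Basic Multilingual Plane count as two edits.

```javascript
// Levenshtein distance via the classic two-row dynamic program.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into '' takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution, or a free match
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Sort translations by distance from the query word, closest first.
// ASSUMPTION: each translation is an object with a string `title`.
function sortBySimilarity(word, translations) {
  const w = word.toLowerCase();
  return translations
    .map((t) => ({ ...t, distance: levenshtein(w, t.title.toLowerCase()) }))
    .sort((x, y) => x.distance - y.distance);
}
```

Keep in mind that this measures how close two spellings are, so it only approximates relatedness: two words can look alike by coincidence, and true cognates can drift far apart.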
Wikipedia is amazing.
Node makes it easy to build complex projects composed of smaller modules, each with their own set of responsibilities.
Smaller modules are easier to
The web browser is more than a means for consuming information: It's also a tool for collecting and manipulating data.