For Developers

Reverb Developer Blog


We admit it: mistakes were made.

The front end of Reverb for Publishers (RFP) is a JavaScript application that can display links relevant to the page you are viewing, concepts derived from the current page, or recent trending articles on your site. It is powered by Reverb APIs that ingest the site content, analyze the text, provide recommendations, and monitor traffic. The bulk of our users run WordPress-powered sites, but you can install this free plugin on any platform.

While RFP now performs well and increases page views by presenting related links on thousands of sites, we did some dumb things during development. Here’s what not to do when creating an application that’s supposed to run in someone else’s webpage.

Don’t require interaction

Initial versions of our application had fancy interactivity we thought was inviting. We tested several radically different designs which required rollover to bring up more details or had stuff following you around the screen on the sidebar.

But when we gathered usage statistics the results were clear: the simplest implementations always won. It’s better to display all the information cleanly right away and not expect to entice viewers to mouse over or interact in any way for more information.

Don’t expect users to use configuration options

We built a lot of optional functionality into our application which could only be accessed via administration screens. Tracking the usage of this showed us that it was extremely rare for people who installed our widget to even open the configuration options much less experiment with all the extra functionality we made available.

The lesson learned here was that we could please finicky customers by providing advanced features but with new installs you live and die by your default configuration. Most people will just evaluate your app on those features and uninstall if it doesn’t fulfill their expectations.

Don’t think your in-house testing has any relation to real user experiences

When we started monitoring the aggregate speed of our application on real users, the results were shocking. Our little widget was reporting a load time of several seconds which we never noticed during development or in-house testing on live sites. Instrumenting the code execution path exposed some serious bottlenecks we had created.

One assumption we made was that jQuery was probably installed on a lot of the sites so the first thing we did was check for a relatively recent version and only load it if necessary, delaying subsequent code execution until it was available. We saw this process adding over a full second to the load time in aggregate real user data. Any synchronous operation like this caused a visible delay in our aggregate load time data.
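The version check described above might look something like the following sketch. The function names and the minimum version passed in are assumptions for illustration, not the values RFP actually used. In a browser, the host page's jQuery version is exposed as `window.jQuery.fn.jquery`.

```javascript
// Compare dotted version strings numerically, e.g. "1.10.2" vs "1.7.0".
function isVersionAtLeast(found, required) {
  var a = String(found).split('.');
  var b = String(required).split('.');
  for (var i = 0; i < Math.max(a.length, b.length); i++) {
    var x = parseInt(a[i], 10) || 0;
    var y = parseInt(b[i], 10) || 0;
    if (x !== y) {
      return x > y;
    }
  }
  return true; // versions are equal
}

// Decide whether the widget must fetch its own copy of jQuery.
// `hostJQuery` is whatever the host page exposes as window.jQuery (may be undefined).
function needsOwnJQuery(hostJQuery, minVersion) {
  if (!hostJQuery || !hostJQuery.fn || !hostJQuery.fn.jquery) {
    return true;
  }
  return !isVersionAtLeast(hostJQuery.fn.jquery, minVersion);
}
```

The segments are compared as numbers because a plain string comparison would rank "1.10.2" below "1.7.0". Note that the check itself is cheap; the cost described above comes from waiting on the download before any other code can run.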

We started by using Mixpanel for reporting which was a really nice way to be able to record and view the data in real time. We eventually moved to an in-house solution when we were able to reproduce the important Mixpanel features so we could report directly to our APIs and could remove the Mixpanel library from the client side code we were delivering to everybody.

Don’t depend on way too many libraries

The initial version of our application included several open source libraries which allowed us to develop features quickly. But when we got serious about performance, we found that removing all these libraries allowed us to reduce total app size by 70%. Admittedly we went overboard during early development including not only jQuery but Atmosphere, Backbone, Underscore, etc. We still use EJS style templates but we precompile them during our build process and then no longer need to include a templating library in the client side code.
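To illustrate what precompiling buys you, here is a minimal sketch of a build step that turns an EJS-style template into the source of a plain function. It handles only `<%= ... %>` output tags, does no HTML escaping, and the `precompile` name is hypothetical; RFP's actual build process is not shown in this post.

```javascript
// Turn an EJS-style template into the source text of a JavaScript function.
// Only <%= expr %> (output) tags are handled in this sketch.
function precompile(template) {
  // Splitting on a capturing group yields: literal, expr, literal, expr, ...
  var parts = template.split(/<%=\s*([\s\S]+?)\s*%>/);
  var body = 'var out = "";\n';
  for (var i = 0; i < parts.length; i++) {
    if (i % 2 === 0) {
      // Literal text between tags, emitted as a quoted string
      body += 'out += ' + JSON.stringify(parts[i]) + ';\n';
    } else {
      // Expression captured from inside a <%= %> tag
      body += 'out += (' + parts[i] + ');\n';
    }
  }
  body += 'return out;';
  return 'function (data) {\n' + body + '\n}';
}

// At build time the returned source would be written into the bundle.
// Here it is evaluated directly just to show that it works.
var source = precompile('<a href="<%= data.url %>"><%= data.title %></a>');
var renderLink = new Function('return ' + source)();
```

Because the client receives only the generated function, no template parser has to ship in the delivered code.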

In order to support our target browser set (Android 2.3+, iOS4+, IE8+, last 3 versions of FF and Chrome) after dropping jQuery we needed to change our selector syntax to use querySelector instead of $:

element = document.querySelector('.wordnik_discovery');

To support IE8, which still represents 5% of traffic, we need to handle the lack of addEventListener:

// Attached to our app object; falls back to attachEvent on IE8.
addEventListener: function (el, eventType, handler) {
  if (el.addEventListener) {
    el.addEventListener(eventType, handler, false);
  } else if (el.attachEvent) {
    el.attachEvent('on' + eventType, handler);
  }
}

Don’t use defenseless CSS

Our design strategy with our widget was to let the host site’s styles shine through, so we made sure not to override colors or type attributes. This worked pretty well and we were able to fit into a variety of designs smoothly. There were of course some styles that we needed to be able to control, and we were careful to tie important layout features like the float layout of template classes back to an ID reference for maximum specificity. But we weren’t careful enough. Site owners will do some things that even in retrospect seem difficult to have anticipated:

a { white-space: pre-wrap; }

This seems like a pretty strange rule, but we actually encountered a site with this applied. Here is an example of the consequences. I overrode the pre-wrap with white-space: nowrap on the middle item.
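One way to guard against rules like this is to generate the widget's defensive styles with every selector tied to the root ID. The sketch below assumes a hypothetical root ID of `reverb-widget` and a made-up set of properties to reset; it is not the stylesheet RFP ships.

```javascript
// Build CSS text in which every selector is prefixed with the widget's root ID,
// so the rules outrank element-level selectors on the host site such as `a { ... }`.
function scopeRules(rootId, rules) {
  var css = '';
  for (var selector in rules) {
    if (!Object.prototype.hasOwnProperty.call(rules, selector)) {
      continue;
    }
    css += '#' + rootId + ' ' + selector + ' { ' + rules[selector] + ' }\n';
  }
  return css;
}

// Hypothetical defaults: reset only the properties the layout depends on.
var defensiveCss = scopeRules('reverb-widget', {
  'a': 'white-space: normal;',
  'li': 'float: none; list-style: none;'
});
```

An ID-scoped rule beats a bare element selector on specificity, but a host rule marked `!important` would still win, so this reduces surprises rather than eliminating them.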

Don’t expect users to install updates

Although WordPress users are presented with update notifications when they view their plugins and updating is a one click process, a significant percentage of our userbase has better things to do with their time. In order to actually fix problems we encountered we designed our widget to load the main application code from our servers asynchronously on domready so we were able to push updates to our entire install base.
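The loader that stays in the installed plugin can be very small. The following is a sketch under stated assumptions: the `loadApp` name and the example URL are hypothetical, and the domready check that triggers the call is left out.

```javascript
// Append an async <script> tag that pulls the main application from a remote server.
// `doc` is the page's document object; `src` is the URL of the hosted bundle.
function loadApp(doc, src) {
  var script = doc.createElement('script');
  script.async = true; // do not block the host page while downloading
  script.src = src;
  // document.head is missing in IE8, so fall back to a tag lookup.
  var head = doc.head || doc.getElementsByTagName('head')[0];
  head.appendChild(script);
  return script;
}

// In a browser, once the DOM is ready:
// loadApp(document, 'https://example.com/widget/app.js');
```

Since the installed code only points at a URL, changing what that URL serves updates every site at once, whether or not the plugin itself was ever upgraded.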

Don’t deliver unnecessary templates

Fully JavaScript-powered apps are able to take advantage of beautiful templating engines, but they usually deliver all the possible templates to the client whether or not they get used. In our case we were delivering all the optional templates when we knew a site could only choose a single display option per widget.

To avoid delivering unnecessary JavaScript template code, we’re currently moving towards the API response providing HTML fragments instead of pure data in a JSON object. Shifting the creation of the HTML fragment to the server allows us to slim down the delivered JavaScript app code by removing all previously client side templates. This would have created some difficulties if we weren’t able to use similar js templating code server side, but we are able to execute Mustache in a variety of languages and can get even better templating options by using Node.js to transform EJS style templates which provide the flexibility I personally find most comfortable.
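A server-side renderer for one display option could be as simple as the sketch below, which could run under Node.js. The function names, the `related-links` class, and the shape of the `links` array are assumptions for illustration.

```javascript
// Escape text so titles and URLs cannot break the generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render only the single display option the site has chosen.
// `links` is an array of { url, title } objects.
function renderFragment(links) {
  var items = links.map(function (link) {
    return '<li><a href="' + escapeHtml(link.url) + '">' +
      escapeHtml(link.title) + '</a></li>';
  });
  return '<ul class="related-links">' + items.join('') + '</ul>';
}
```

The client then only has to insert the returned string into the page, so neither the templates nor a template engine need to be delivered to the browser.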

Don’t have insufficient JavaScript encapsulation

WordPress can be a bit of a bad neighborhood, front-end-wise, so it is a wonderful place to see how your code will fare in unusual situations. It is not uncommon to see JavaScript errors caused by WordPress plugins on live sites, whether from the plugin’s front-end code or a mishandled configuration. We were pretty careful to do no harm to the sites where we were installed so we didn’t add to the confusion. I’m not saying we were blameless, but we followed recommended practices to keep our JavaScript unobtrusive. Everything is wrapped in an immediately invoked function, and our app is a single global object with all the required functions attached to it.

(function(){
var WRC = window._WRC = { version: "0.6.5"};
_WRC.extend(_WRC, {
  DOM: {...
})()
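For readers who want a complete version of the pattern, here is a self-contained sketch. The global name `_WRC_DEMO`, the `extend` helper, and the `byClass` method are made up for this example and are not the contents of the real `_WRC` object.

```javascript
(function (root) {
  // Bail out if another copy of the widget already ran on this page.
  if (root._WRC_DEMO) {
    return;
  }

  // Private helper: invisible outside this function.
  function extend(target, source) {
    for (var key in source) {
      if (Object.prototype.hasOwnProperty.call(source, key)) {
        target[key] = source[key];
      }
    }
    return target;
  }

  // The only name added to the global scope.
  var app = root._WRC_DEMO = { version: '0.0.1' };

  extend(app, {
    DOM: {
      byClass: function (doc, className) {
        return doc.querySelector('.' + className);
      }
    }
  });
})(typeof window !== 'undefined' ? window : globalThis);
```

Everything the widget needs hangs off one object, so the only collision risk with the host page is that single global name.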

[Photo Credit: CC BY 2.0 by velvettangerine]


A Day in the Life of a Reverb Dev: Stew O’Connor

by admin  |  Posted March 27, 2013

Continuing our series on the lives of Reverb developers, today we talk to Stew O’Connor, Senior Backend Engineer. Stew tells us about his fondness for Emacs, his favorite music, and life outside of Reverb (hint: it involves beer and princesses).

What’s your favorite coding editor/IDE? Why?

Emacs 4 Life. I’ve never spent much time with anything other than Emacs. I’ve been using Emacs daily for more than 20 years now, and it’s hard to imagine that I’m going to switch now.

When I’ve tried to use a “Real IDE,” I always feel like I’m being asked to spend way too much time fidgeting with the IDE instead of just editing code with their sub-par editor. I spent 10 years as a Java developer eschewing IDEs. Now that I can have Emacs + ENSIME, my Emacs experience has been so greatly enhanced that I don’t mind that ENSIME is often wonky.

What’s your beverage of choice?

Espresso.

The best lunch within five blocks of the office?

Fish tacos from Rubio’s. However, I’m not usually in the San Mateo office. In the Kentucky office* where I’m usually working, I make myself huevos rancheros with eggs from my neighbor’s chickens.

What’s your favorite music to listen to while working? And your sound system of choice?

It would be lots of roots reggae, but my wife (who is my office-mate) won’t stand for it. We’ve settled in the middle ground of listening to lots of DJ sets from the UK, leaning heavily on breakbeats. We listen to a lot of Annie Mac radio shows.

Favorite sound system? That would have to be the Jah Shaka Sound System out of London! (Although I’m sure this was not the intended meaning of the question :))

What are your favorite languages?

I love functional programming, and have fallen in love with Scala. I loved SML when I learned it in college and I wish that I had found an excuse to learn OCaml, but I never had. When I want to do some quick scripting, I’m probably reaching for Python. I wish I had the time and the need to become a better Haskell developer because it would make me a better Scala developer.

Which language do you think is terrible?

PHP. See here.

What was your first language? When did you learn it?

I started with BASIC in the ’80s. Both my parents were computer programmers and they got me started early.

Where do you go for your tech news?

Twitter/IRC.

Where do you go for help?

IRC.

What’s the best thing you’ve read about coding lately?

Functional Programming in Scala by Chiusano and Bjarnason has so far been great. I need to devote some time to making myself read more and work through more of the exercises.

What’s the worst thing you’ve read about coding lately?

That Project Lambda, which is bringing lambdas to Java 8, will have an Optional class like Scala’s Option, but they decided not to add map/flatMap methods. Actually worse is that they had a map method and decided to remove it.

What’s your favorite book/movie/TV show?

Book: Flatland by Edwin Abbott.

Movie: My go-to answer has always been Apocalypse Now, but lately a more accurate answer is probably either Primer or Idiocracy.

If I’m watching TV, I probably want to see dumb people making bad decisions, so it’s Workaholics or It’s Always Sunny in Philadelphia.

If you had a time machine, what would you go back and tell your younger self?

Don’t leave your record collection at college when you go home for the summer – you’ll never see those records again.

If you weren’t a dev, what would you be?

A cook.

What do you like to do when you’re not coding?

Brew beer, knit, ride a bike, though lately most of my non-coding time is spent discussing details of the lives of various princesses with my three year old daughter.

What’s your strategy for hitting the bell in Reverb’s Friday contest?

Use the sights on the gun to line up with the bell as accurately as I can, then randomly aim a little farther to the right and up of that. As I have the highest win percentage** of anyone in the office, you can determine that either this is an excellent strategy, or that the sample size is too small.

*Also known as “Stew’s house.”
**One out of three tries, or 33%.


Dissecting Our Wordnik Developer Site

by Tony Tam  |  Posted March 14, 2013


To those familiar with the Wordnik developer site, it looks unchanged. There’s a section for code libraries to connect to our API, examples of applications using the API, and of course the Swagger Sandbox. Functional, simple, and easy to navigate.

But just the other day, the site was totally rewritten in HTML, CSS, and Scala. And the lines of code dropped to 1/10th what they were before. What’s the deal? And what makes it so much snappier now?

We’ve updated the site to follow an emerging web architecture called a Single Page Application, or SPA, which lets the browser handle all display and rendering logic. The UI, via JavaScript, has plenty of power to decide what to show the user, and keep state as needed. The HTML is more semantic – meaning the layout follows the content more than the display logic (no more <table> overuse!) and CSS can respond to media queries, meaning small-form devices or different screen layouts can switch rendering templates without reprocessing data on the server.

As an added benefit, the consumer (your ultimate judge) gets a snappier, lighter weight experience which can even work offline. Sounds great, right? So how is this done?

If you view source on the developer site, you’ll see that it’s pretty much empty. Aside from some JavaScript and CSS files, there are just a few <div> elements which serve as basic containers for the application: a header, a content section, and a footer.

[Image: site-source]

As the page loads, the JavaScript calls the API server (powered by Scalatra) and loads a chunk of data. You can see that happen in Chrome’s network panel. Once that data is received, it’s inserted into the DOM under the header div. Let’s look at that more.

[Image: xml-source]

Note here, the response from the API server is HTML, not JSON. Why is that? Well, we’re keeping the client-side rendering as light-weight as possible. The server is extremely good at efficient generation of XML. The client, however, has to keep state of the DOM during manipulation (there are tricks to lessen this burden, of course), plus this header almost never changes.

So why bother putting the rendering logic on the client? All that’s required is a simple call to `load` the content into the DOM:

[Image: js-inject]

Same goes for other sections. We’re simply removing and replacing entire DOM elements – big ones – with pre-computed, rendered HTML. No fancy JavaScript is required for this part of the site, and the entire client-side code including comments is only 142 lines before minification. There’s really not a whole lot to go wrong in that code.
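As a rough illustration of that replace-the-whole-section approach, here is a sketch in plain JavaScript. The `loadInto` name, the `request` transport, and the `/api/header` URL in the comment are assumptions; with jQuery, its `.load()` method does the equivalent in one call.

```javascript
// Fetch a pre-rendered HTML fragment and drop it into a container element.
// `request(url, callback)` stands in for whatever transport the page uses
// (XMLHttpRequest, jQuery.get, fetch); it must call back with the HTML text.
function loadInto(container, url, request) {
  request(url, function (html) {
    container.innerHTML = html; // replace the whole section in one step
  });
}

// With jQuery on the page, the equivalent would be along the lines of:
// $('#header').load('/api/header');
```

There is no client-side template and no per-field DOM manipulation here, which is why so little client code is needed.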

The server is just as simple to write. Using Scalatra, we simply generate XML and return it in response to an HTTP GET call.

A trait defining the header, in XML:
[Image: common]

Mixing the trait into a singleton object:
[Image: server-service]

Finally, the GET method to support the request for the header HTML:
[Image: scalatra-api]

Of course, backing the XML generation with a database call, fancy logic, etc. is trivial. As you might guess, the code to generate this site, both client and server, should be tiny. And it is – in fact, the server codebase to generate developer.wordnik.com was cut down to 1/5th the original size, while the templates, JavaScript, and CSS were cut to 1/20th the old implementation.

Now for a simple site like developer.wordnik.com, do you need to do fancy techniques like this? Of course not, but we’re developers, and what better place to show what we consider to be best practices than our developer site, right?


API Strategy Conference Recap

by Tony Tam  |  Posted March 1, 2013

Last week, I attended the API Strategy Conference in New York City, where a diverse group of techies got together to talk about the future of APIs for developers, service providers, and big enterprise.

It was a very open discussion about the way both computers and devices talk to each other over both public and private networks. 3Scale and API Evangelist @kinlane organized the event, and by all measures, it was a great success. The mood was open and honest, and a ton of great discussion happened during the talks, in the hallways, and of course, the bars.

While my talk was about Swagger, I was able to attend a whole bunch of sessions and have conversations with some very insightful people. I’d like to share some of the major themes I observed over those two days (I’m skipping the obvious ones, like “everybody wants APIs”).

Delight your Clients. Who are they, again?

While the immediate client is the developer, the ultimate client is the end user. Your APIs need to satisfy both, and that’s not always in-line with the “service-oriented” thinking of many back-end engineers. The use cases should drive the solution, and you shouldn’t cut corners.

The API versioning problem

Versioning APIs is an unsolved problem. It’s hard because changes to signatures, payloads, etc. are difficult to detect, and supporting several versions at once typically means running more servers than you may want to. api500.com has a good list of how this is done, but let’s revisit that topic in a bit.

Developer-friendly Stripe has a very cool solution to this – an API version is associated with each integration. That allows their forward proxy (assuming that’s what they use) to route requests to the right place. Upgrading at your own pace is an excellent way to keep developers happy. Really.
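To make the idea concrete, here is a toy sketch of per-integration version pinning. It is not Stripe's implementation: the account IDs, the date-style version labels, and the response shapes are all invented for illustration.

```javascript
// Each integration is pinned to the API version it first used (hypothetical data).
var pinnedVersions = { acct_1: '2012-11-07', acct_2: '2013-02-13' };

// One handler per supported version; each returns that version's response shape.
var handlers = {
  '2012-11-07': function (charge) {
    return { amount: charge.amount };
  },
  '2013-02-13': function (charge) {
    return { amount: charge.amount, currency: charge.currency };
  }
};

// Route a request to the handler for the caller's pinned version.
function respond(accountId, charge, latestVersion) {
  var version = pinnedVersions[accountId] || latestVersion; // new integrations get the latest
  return handlers[version](charge);
}
```

Existing integrations keep seeing the response shape they were built against, and upgrading becomes an explicit change to the pinned version rather than a surprise.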

API Discovery is fragmented and needs improvement

How do you find APIs? How good are they? Would you trust them in the middle of your application workflow? We need more than just a directory of APIs – we need both developer feedback and metadata about the API. How many people integrate with it? What’s the uptime? How are the drivers? How often does the signature change?

Folks like Singly and Webshell are building abstraction and unification layers to 3rd-party APIs. That certainly makes discovery easier, since the underlying services need integration and are therefore known. New services, however, need to be found, and the whole issue of developer feedback needs to be integrated.

Not all APIs are for fetching photos or tweets about cats

APIs can be simple. Zero workflow required (once authenticated), even if there’s complex processing happening behind the curtain. But that’s not always the case, and workflows are required.

This entails passing values from one API call to another. Often there are branches in logic, depending on what happened. This makes the complexity of the API documentation snowball. Who has nailed this? From the discussions, apparently nobody.

The enterprise is coming! And they are skeptical about services in their workflow

Even big companies with tons of resources need to open up their services to stay both competitive and nimble. Developing for mobile devices has driven this need very quickly – they have, though, some additional challenges.

Some are public companies. Many have gigantic traffic requirements. They might have scary SLAs and deployment requirements. The thought of putting a multi-tenant, cloud-based service run by five smart guys from a coffee shop in San Francisco smack in the middle of their workflow is often a non-starter. They want control of the systems, backup systems, and a big red telephone for when something goes wrong.

Serving both small shops (developers) and the enterprise is an almost intractable problem for most small companies. It’s very important to know who your target market is.

REST is not a silver bullet

There was a deafening silence in the second day keynote when Daniel Jacobson said that Netflix is moving away from REST. Whaaaa?

Backed by usage data (and lots of it), investigation, and solid engineering, Daniel made a very compelling argument against REST as a one-size-fits-all solution for your connectivity needs. The often-complex workflow required by clients can be accomplished in a single request to their API, which returns presentation data, not raw data.

They call it an “Experience API”, and there may be discrete endpoints for each device type. This gives server developers more freedom to change code without risking everything and makes a more streamlined request/response cycle to get presentation data. Remember in the first point, the end user is the ultimate client. So if you can make things faster and give a good or better experience, you are winning.

How about that API versioning issue?

Let’s talk about API versioning. It’s a big problem as I covered in my presentation during the API tools track. What happens when you change the signature of the API? How will developers know, other than their app breaking? Maybe you add a new API – how do your developers find out about it? Twitter? Diffing your documentation?

Turns out that what most people do is what was described by api500.com, and I have to say, it’s almost pure luck if you find out about an API change before it’s too late. Do you personally keep up with your Twitter feeds? Never missed a message from @twitterapi? Read those developer emails?

Turns out Swagger can help with this problem. When you describe your API through Swagger, you can easily scan the API description and detect changes. Then it’s up to you to act on them. Here are some examples.

1. Model changes. Let’s say you have an API like the Swagger Petstore sample. This API describes a Pet model through the Swagger JSON:

[Image: pet]

Here, the description field is not required. Say that changes, and you require it now – and if not present, you start returning 400 response codes and break all your consumers over that one change!

Detecting this change is trivial, by scanning the API Declaration for this particular API. It’s just JSON, and diffing the two is easy.

2. New functionality. Say you add a new operation to your API – how does the world find out about it? Again, change detection is quite simple once accurate Swagger descriptions are produced.
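Both kinds of change detection can be sketched in a few lines. The model shape used here (a `properties` map whose entries carry a `required` flag) and the flat list of operation nicknames are simplified assumptions, not the exact Swagger JSON layout.

```javascript
// Fields that were optional (or absent) before and are required now.
function newlyRequired(oldModel, newModel) {
  var changed = [];
  for (var name in newModel.properties) {
    var before = oldModel.properties[name];
    var after = newModel.properties[name];
    if (after.required && !(before && before.required)) {
      changed.push(name);
    }
  }
  return changed;
}

// Operation nicknames present in the new description but not in the old one.
function addedOperations(oldNicknames, newNicknames) {
  return newNicknames.filter(function (nickname) {
    return oldNicknames.indexOf(nickname) === -1;
  });
}
```

A consumer could run checks like these on a schedule against a saved copy of the API description and get an alert before the first 400 response shows up.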

To keep people sane, it’s important to use computers wherever possible to do mundane tasks. While I love the Twitter API Docs, I really don’t care to read them every hour looking for changes.

And did you know that decoupled documentation systems (like Twitter’s) are typically updated after the API is changed? It’s true, software development is much faster than documentation. So what if we had an API directory which not only allowed user feedback on uptime, drivers, etc., but also could detect changes in the APIs it lists, and report back to the users who follow them? All you need is an interface to describe the APIs themselves – the rest is easy.


Wordnik API in the Wild: RapBot, by Darius Kazemi

by admin  |  Posted February 21, 2013


We love to see creative uses of the Wordnik API, so when we noticed RapBot, “a freestyle 80s battle rap generator,” we were more than a little excited.

Today we’re talking to the creator of RapBot, Boston-based developer Darius Kazemi (he’s also behind the Random Shopper) about how he came up with the idea for RapBot and how he uses the Wordnik API.

How did you find out about the Wordnik API? Why did you choose it?

I discovered Wordnik on a different project: Metaphor-a-Minute!, which tweets a random metaphor every two minutes (thanks to Twitter rate limits, oh well!). I was looking around for some kind of dictionary that gave me parts of speech that I could parse and choose random words from. Learning about Wordnik was like Christmas to me: no more obscure academic software libraries that I need a degree in linguistics to understand! Just an API call: “give me a random noun” — boom, done.

Did you find the API first, and then decide to make an app, or did you have an idea for an app and find the Wordnik API?

RapBot itself was actually inspired by the Wordnik API. I was building wordnik-bb, a Node.js interface to the Wordnik API, and as I was researching relatedWords I noticed there was a new “rhyme” relationship available! That got me really excited and as soon as I was done writing wordnik-bb, I started playing with the “rhyme” feature.

What surprised you most about the API?

The best part about the API is how surprising the results themselves often are. For example, I just entered “rap” into relatedWords and asked for a hypernym in return (encompassing concept, basically). It gave me “African-American music.” That’s so cool! A lot of my work relies on serendipity and as far as I’m concerned, the Wordnik API is a serendipity generator. We’re hard-wired to find meaning in language, which is one of the core concepts behind my Metaphor-a-Minute! project. What that means is that something like the Wordnik API is always going to be surprising and delightful.

For example, let’s say I hook up a random number generator to a mathematical equation and make the output “[number1] + [number2] = [valid result]”. Nobody’s going to fall out of their seat when the output is “4.1 + 6.6 = 10.7”. But if I say “your [noun] is [adjective]” and pop it into Wordnik… let’s see the first result here… “your bicyclist is preserved.” In both cases I’m taking two random inputs and putting them into a valid arrangement, but the one that uses language is infinitely more interesting than the one that doesn’t. There’s a reason there was never a “Math Edition” of Mad Libs!
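The slot-filling step itself is tiny. In this sketch, `fillTemplate` and `pickWord` are hypothetical names; `pickWord` stands in for a call that returns a random word for a given part of speech.

```javascript
// Replace each [partOfSpeech] slot using `pickWord`, a function that returns
// a word for the requested part of speech.
function fillTemplate(template, pickWord) {
  return template.replace(/\[(\w+)\]/g, function (match, partOfSpeech) {
    return pickWord(partOfSpeech);
  });
}
```

Passing the word source in as a function keeps the template logic separate from the network call, which also makes it easy to test with a fixed word list.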

What other APIs do you use? What frameworks/languages, etc.?

I work primarily in JavaScript, both clientside and serverside. I use Node.js whenever I can get away with it (which is often). My background is in videogame development so I’ve used a lot of game development frameworks/engines like GameMaker and Impact. In terms of APIs, I like Yelp’s API (especially its ability to return a neighborhood name from a lat/long). I also use the API interface to Google Spreadsheets, which I tend to use as a basic database/CMS.

What advice do you have for others using the API?

If you’re going to use randomWord, make sure to play around with the min/max corpus frequency of the responses. It’s incredibly powerful. If you use low corpus frequencies you can make your generative text sound like a weird mix of erudite and juvenile. I just ran a few API calls with a maxCorpusCount of 10, which leads to some pretty rare words:

the embuggerance of the polyvalence is tittivated most biotically within the craterlet

Or we can put a very high minCorpusCount:

the child of the change is started most certainly within the energy

Whenever I’m composing with randomness, I try to get a feel for these parameters until the words seem like they’re in the right range.

Another thing to note is that each part of speech has its own frequencies and its own idiosyncrasies: be sure to tweak all the parameters for each part of speech individually. And lastly I would recommend getting a good helper library like inflection-js, which can handle pluralizing and singularizing of words.
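A small helper makes it easy to experiment with these ranges per part of speech. The sketch below only builds the request URL. The endpoint path and parameter names are written from memory of the Wordnik v4 API and should be checked against the current documentation; the `randomWordUrl` name and the option names are hypothetical.

```javascript
// Build a randomWord request URL with a corpus-frequency range.
// `options` holds partOfSpeech, minCorpusCount and maxCorpusCount.
function randomWordUrl(apiKey, options) {
  var params = [
    'hasDictionaryDef=true',
    'includePartOfSpeech=' + encodeURIComponent(options.partOfSpeech),
    'minCorpusCount=' + options.minCorpusCount,
    'maxCorpusCount=' + options.maxCorpusCount,
    'api_key=' + encodeURIComponent(apiKey)
  ];
  return 'https://api.wordnik.com/v4/words.json/randomWord?' + params.join('&');
}

// Rare nouns: a low ceiling on corpus frequency.
var rareNounUrl = randomWordUrl('YOUR_API_KEY', {
  partOfSpeech: 'noun',
  minCorpusCount: 0,
  maxCorpusCount: 10
});
```

Keeping one set of options per part of speech makes it straightforward to tune nouns, verbs, and adjectives independently, as suggested above.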

Final question: Swagger, best thing since sliced bread, or best thing ever?

I have a confession: I sometimes think about Swagger and get sad that it’s not used by more people. It makes my life so much easier as a developer. Every API needs a sandbox and documentation, and Swagger is the way to get it all in one go.

[Photo: "Mic," CC BY 2.0 by Robert Bejil]


Scalatra 2.2 Released

by Ivan Porto  |  Posted February 14, 2013

Last week we released Scalatra 2.2. This is our biggest release so far and introduces several exciting features and enhancements such as commands for handling input, Atmosphere for Websockets and Comet, a much deeper Swagger integration, and a completely upgraded API.

In short, this Scalatra version fixes most of the big problems we were aware of. Probably one of the nastiest of those problems was the fact that we were using thread-locals to store the request and response; when you then use a future or something similar, the request is no longer available.

Let’s walk through some of these changes.

Working around thread-locals.

In a previous version we had migrated all our internal states to either the servlet context attributes or the request attributes, depending on their scope. In this release we made everything that accesses the request or response take them as implicit parameters. For people overriding our methods, this is a breaking change but easily fixed by adding the parameters to your method override. We also added an AsyncResult construct whose only purpose is to help you not close over thread-locals.

So what exactly is the problem?


def skip = params.getAsOrElse("skip", 0)

get("/things/:id") {
  Future {
    params("id")  // throws NPE because request is not available when the future executes
  }
}

post("/things") {
  myThingsActor ? Post(parsedBody.extract[Things]) map { things =>
    if (things.isEmpty) status = 404 // throws NPE because response is not available when the future executes
    ()
  }
}

// assuming scentry is mixed in and user is something stored on the request or in cookies or something
get("/stuff/:id") {
  val stuff: Future[Stuff] = getStuff(params("id"))
  // everything is still fine
  stuff map { allTheThings => 
    getTrinketsForUser(allTheThings, user, skip)  // throws NPE because request is not available when the future executes
  }
}

And since this is something we absolutely had to fix, we had to introduce some breaking changes but they really were for the better.
Currently there are two ways to get around it: bring request/response into your action in implicit vals or use the AsyncResult trait to do this for you.

Let’s rewrite the broken examples in terms of the first workaround.


def skip(implicit request: HttpServletRequest) = params.getAsOrElse("skip", 0)

get("/things/:id") {
  implicit val request = this.request
  Future {
    params("id")  // no more NPE
  }
}

post("/things") {
  implicit val response = this.response
  myThingsActor ? Post(parsedBody.extract[Things]) map { things =>
    if (things.isEmpty) status = 404 // no more NPE
    ()
  }
}

// assuming scentry is mixed in and user is something stored on the request or in cookies or something
get("/stuff/:id") {
  implicit val request = this.request
  implicit val response = this.response
  val stuff: Future[Stuff] = getStuff(params("id"))

  stuff map { allTheThings => 
    getTrinketsForUser(allTheThings, user, skip)  // no more NPE
  }
}

With the AsyncResult you get another chance to add some default context to your async operations but other than that it works in a very similar way.


def skip(implicit request: HttpServletRequest) = params.getAsOrElse("skip", 0)

get("/things/:id") {
  new AsyncResult { val is = 
    Future {
      params("id")  // no more NPE
    }
  }
}

post("/things") {
  new AsyncResult { val is = 
    myThingsActor ? Post(parsedBody.extract[Things]) map { things =>
      if (things.isEmpty) status = 404 // no more NPE
      ()
    }
  }
}

// assuming scentry is mixed in and user is something stored on the request or in cookies or something
get("/stuff/:id") {
  new AsyncResult { val is = {
    val stuff: Future[Stuff] = getStuff(params("id"))

    stuff map { allTheThings => 
      getTrinketsForUser(allTheThings, user, skip)  // no more NPE
    }
  } }
}

The AsyncResult has an implicit parameter of ScalatraContext and every ScalatraBase has an implicit conversion to a ScalatraContext so the request and response are now stable values and no longer stuck in thread-locals.

New Swagger API

In the previous version of Scalatra, we introduced Swagger support. However the API we introduced then ended up being very messy and error prone since most of it used strings. At Wordnik we started using Scalatra and one of my co-workers, who had just started learning the language, remarked: Swagger makes Scalatra ugly. Clearly something had to be done about this!

This release tries to fix some of these issues by using as much information from the context as possible and defining a fluent API for describing Swagger operations.

Now there are only strings for notes, descriptions, names, etc. Swagger API integrates with Scalatra’s commands so you only define the parameters for a request once. It automatically registers models when you provide them and converts the Scalatra route matcher to a Swagger path string. Let’s take a look at a before and after.

This is how it used to be:

// declare the models, and the models it uses
// case class Pet(id: Long, category: Category, name: String, urls: List[String], tags: List[Tag], status: String)
models = Map("Pet" -> classOf[Pet], "Category" -> classOf[Category], "Tag" -> classOf[Tag])

// declare the route with the swagger annotations
get("/findByStatus",
  summary("Finds Pets by status"),
  nickname("findPetsByStatus"),
  responseClass("List[Pet]"),
  endpoint("findByStatus"),
  notes("Multiple status values can be provided with comma separated strings"),
  parameters(
    Parameter("status",
      "Status values that need to be considered for filter",
      DataType.String,
      paramType = ParamType.Query,
      defaultValue = Some("available"),
      allowableValues = AllowableValues("available", "pending", "sold")))) {
  data.findPetsByStatus(params("status"))  // this is our actual implementation, you might have missed it.
}

This is what it is now:

// declare the swagger operation description
val findByStatus =
  (apiOperation[List[Pet]]("findPetsByStatus")
    summary "Finds Pets by status"
    notes "Multiple status values can be provided with comma separated strings"
    parameter (queryParam[String]("status").required  // required is the default value so not strictly necessary
                description "Status values that need to be considered for filter"
                defaultValue "available"
                allowableValues ("available", "pending", "sold")))

// declare the route with the swagger annotation
get("/findByStatus", operation(findByStatus)) {
  data.findPetsByStatus(params("status"))
}

Endpoint declaration is no longer necessary. Instead, you work with actual types and no longer have to remember to register models and all their referenced models. In my opinion, if you simply write the Swagger declaration close to where your route lives, the docs still live with the code, but they no longer obscure the application code.
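As a rough picture of the route-to-path conversion mentioned earlier: Swagger writes path parameters as {name}, while a Scalatra route writes them as :name. The helper below is a hypothetical sketch, not the code Scalatra actually uses, and it only handles simple named parameters (no splats or regular-expression routes).

```scala
import scala.util.matching.Regex

object SwaggerPathSketch {
  // Matches a named route parameter such as ":id" and captures the name.
  private val NamedParam: Regex = """:(\w+)""".r

  // Hypothetical helper: rewrites "/things/:id" as "/things/{id}".
  def toSwaggerPath(route: String): String =
    NamedParam.replaceAllIn(route, m => Regex.quoteReplacement("{" + m.group(1) + "}"))
}
```

A route without parameters, such as "/findByStatus", passes through unchanged.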

What do you think?


A Day in the Life of a Reverb Dev: Jeff Barbose

by admin  |  Posted February 13, 2013

Continuing our series on the lives of Reverb developers, today we talk to Jeff Barbose, iOS Senior Engineer. Jeff tells us about his favorite languages, a new favorite beverage, and his traveling strategy.

What’s your favorite coding editor/IDE? Why?

Xcode. It’s the developer tool from Apple, and it’s what you use for iOS and Mac OS X development. While it’s possible to use external editors with it, and even to go as far as to script it all from the command line, I’m not that kind of geek. Besides, it uses Emacs key-bindings for navigation, so what’s not to love?

What’s your beverage of choice?

1. Coffee.
2. Coconut water — it’s a new thing for me.

The best lunch within five blocks of the office?

Rubio’s fish tacos.

What’s your favorite music to listen to while working? And your sound system of choice?

Songwriters, when I do listen to music. Matt Alber is always near the top of my list.

What are your favorite languages?

Objective-C: the antidote to C++.
CLOS
Dylan
MACRO-11: VAX/VMS is horrible. Programming in assembler is horrible. But somehow MACRO-11, the assembler for VAX, is oddly elegant. Go figger.

Which language do you think is terrible?

C++, especially templates.
Java.
Ruby.

Was there only supposed to be one I named? :)

What was your first language? When did you learn it?

BASIC, at home on a TRS-80. Shortly after, Z-80 assembler.

Where do you go for your tech news?

Daring Fireball, AppleInsider, The Verge

Where do you go for help?

Stack Overflow

What’s the best thing you’ve read about coding lately?

3rd parties using NSIncrementalStore to wrap Core Data models around Web APIs.

What’s the worst thing you’ve read about coding lately?

Apple’s sad documentation on NSIncrementalStore.

What’s your favorite book?

Fiction: You Can’t Go Home Again by Thomas Wolfe.
Nonfiction: Trickster Makes This World by Lewis Hyde.

How about your favorite movie and TV show?

Movies: Broadcast News, Philadelphia Story, Vertigo, Raising Arizona
TV Shows: Currently: “Raising Hope.” Ever: “The West Wing.”

If you had a time machine, what would you go back and tell your younger self?

On 30 Dec 2005, wait until you get to Noe St.

[Editor - Mysterious!]

If you weren’t a dev, what would you be?

An architect.  The traditional kind, as in brick, mortar & habitable spaces.

What do you like to do when you’re not coding?

Read everything while traveling everywhere (whenever I’m not talking to everyone).

What’s your strategy for hitting the bell in Wordnik’s Friday contest?

If I had a strategy or if I had talent at hitting the bell, I’d be Jim Hao.

[Editor - Jim Hao is the bell winningest Reverb dev.]


Computational Linguists: The Heart of What We Do

by Will Fitzgerald  |  Posted February 7, 2013

On the Reverb Developer Blog, we have been running a series called A Day in the Life of a Wordnik Dev, a light-hearted look at the day-to-day life of Reverb developers and researchers: our favorite lunch spots, which side we’re on in the editor wars, our strategy for the weekly office bell-shooting contest, and so on.

But this doesn’t get at the heart of what we do as computational linguists, whether it’s working on recommendation systems for related content or building out the world’s largest dictionary. What does a computational linguist actually do – particularly in the non-academic world?

Of course, the specifics of what a computational linguist does will vary based on the needs of the organization. At Reverb, much of what we do includes exploration of large corpora and text data sets, word graphs, recommendation systems, topic and concept modeling, and personalization.

Recently, I’ve been working to refresh the data on Wordnik.com that we get from a sister project, Wiktionary. Wiktionary’s data is extensive and a useful addition to our site, not only for its definitions, but also for its usage labels and pronunciations. This morning, I was imagining what this task would have been like in the old days of lexicography, when definitions were kept on slips of paper (see below the famous picture of James Murray, first editor of the Oxford English Dictionary, with the definition slips on shelves behind him).

James-Murray

How long would it take Murray to write out Wiktionary by hand? Yet Wiktionary, with its approximately 480 thousand English words (at my last count, 481,304 – computational linguists tend to do a lot of counting), is a modestly sized data source by recent standards.

At Reverb, those of us who wear the computational linguist hat usually do our own programming. We write the Python or Ruby scripts, or Scala code, to convert Wiktionary data to our internal dictionary models. We all feel fortunate that we get to write code, and think of ourselves as coders as well as linguists.

Some of this data from Wiktionary will be added to our word graph – that is, our set of words, data about them, and their interconnections. We recently added a visualization of our word graph to Wordnik.com. Check out the “Wordmap” for craven: it displays, in visual form, its synonyms, word forms, and rhymes, as well as words that appear in the same context.

It would be nice if we – and Wiktionary – also listed courageous as an antonym; it’s part of our job to find these relationships computationally and add them. Here’s a thought experiment: how would you write a program to find antonyms semi- or completely automatically?

Turning to our related content recommendation systems, such as Reverb for Publishers, we believe at the core of a good recommendation is the content of what someone is reading now, and the content of articles that they might be reading. This may seem like motherhood and apple pie, but many systems don’t use much content, and doing so is more difficult than you might think.

As computational linguists, we think hard about what content is actually important in making recommendations, and create systems to extract that content, make comparisons, and run experiments to see if our hypotheses are correct. We spend a fair amount of our time exploring and evaluating systems for content extraction (such as the Stanford NLP pipeline), and then integrating them into our experimental, staging, and production systems.

For example, we have been exploring the use of topic models for making content recommendations. There’s more than a little bit of art in setting up topic models, and we as computational linguists must tune models to fit the task at hand. How many topics are useful? Which values for tuning parameters should we use? What data should we train on? Which topic modeling system should we use? The number of choices is immense, and we can’t empirically test them all, so making good guesses is part of the job.

We have also been looking at how we can use concepts to make good recommendations. It’s easy to over-philosophize the meaning of meaning, but we believe we can make better recommendations if we have a better understanding of what the words in a document mean, rather than just their surface forms. Some of this is picking out the names mentioned in the texts (especially people, locations, and organizations), but we’d also like to know if someone is writing about class, whether they mean the sociological sense (the middle class), the computer science sense (the Adjective class inherits from the Word class), the school sense (I took a class in computational linguistics), or the way that our founder dresses (Hey, Erin, that dress with the eyeglass print has real class).

Finally, we are very interested in making our recommendations a personal experience. There is an adage, attributed to Firth, that a word is known by the company it keeps, and we believe that readers can be known by the company of words they keep. If you like to read articles on the business of sports, we want to show you more of those articles (rather than, for example, articles on the business of pets). And so we use similar techniques, such as topic modeling, name and concept extraction, and other tools in our NLP tool chests, to make these recommendations.
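To make the “company of words” idea concrete, here is a deliberately tiny sketch. It is not how our production systems work: a plain bag of words stands in for topics, names and concepts, the stop-word list is a toy, and every name in it (ReaderProfileSketch, bagOfWords, rank) is invented for this example. It builds a reader profile from the articles someone has read and ranks candidate articles by cosine similarity to that profile.

```scala
object ReaderProfileSketch {
  type Counts = Map[String, Double]

  // Toy stop-word list; without it, words like "the" dominate the similarity.
  private val stopWords = Set("a", "an", "and", "in", "of", "on", "the", "to")

  // Lower-cased word counts for one piece of text.
  def bagOfWords(text: String): Counts =
    text.toLowerCase.split("[^a-z]+")
      .filter(w => w.nonEmpty && !stopWords(w))
      .groupBy(identity)
      .map { case (word, hits) => word -> hits.length.toDouble }

  // Adds two count maps together, key by key.
  def merge(a: Counts, b: Counts): Counts =
    (a.keySet ++ b.keySet)
      .map(k => k -> (a.getOrElse(k, 0.0) + b.getOrElse(k, 0.0)))
      .toMap

  // Cosine similarity between two count maps; 0.0 if either one is empty.
  def cosine(a: Counts, b: Counts): Double = {
    val dot = a.iterator.map { case (k, v) => v * b.getOrElse(k, 0.0) }.sum
    val norms = math.sqrt(a.values.map(v => v * v).sum) *
      math.sqrt(b.values.map(v => v * v).sum)
    if (norms == 0.0) 0.0 else dot / norms
  }

  // Orders candidates from most to least similar to the reading history.
  def rank(history: Seq[String], candidates: Seq[String]): Seq[String] = {
    val profile = history.map(bagOfWords).foldLeft(Map.empty[String, Double])(merge)
    candidates.sortBy(c => -cosine(profile, bagOfWords(c)))
  }
}
```

With a reading history about the business of sports, a candidate titled “sports business news” ranks ahead of “the business of pets”, because it shares more of the reader’s vocabulary.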

The life of a computational linguist is varied and interesting, working with large amounts of textual (and, increasingly, non-textual) data, writing code, running experiments, reviewing scientific literature and relevant software systems, and interacting with the other developers and project managers. We even get to write an occasional essay.

If you have a strong computational linguistics or machine learning background, and are looking for a position, take a look at our jobs page, or feel free to contact me directly (will@helloreverb.com). Also let us know if you have any further questions about what computational linguists do. We’d love to hear from you.



Welcome from the Reverb Engineering Team!

Who’s Reverb, you may be asking? It’s a new company that the founders of Wordnik have launched, and you’ll be seeing all sorts of news about new and improved tools and frameworks for developers here, from the Wordnik API to Swagger, Atmosphere and most recently Scalatra, all under the umbrella of Reverb for Developers.

Reverb builds tools and software to enable developers to push the limits on the web, across your infrastructure, and of course on mobile. Follow @ReverbForDevs (formerly @wordnikapi) for updates on our tools.

If you’re already a user of our tools, what does this mean to you? Here’s the breakdown.

Wordnik API

For dictionary and word information, the Wordnik API provides a generous 15,000 requests per hour for free. It will continue to do so, and to evolve with new features. If you’re using the Wordnik API, you don’t need to change a thing. Discuss the Wordnik API on our Google group, or on irc.freenode.net, #wordnik. If you use Wordnik in a commercial application, our attribution requirements are the same.

Swagger

Swagger will remain in the Wordnik GitHub repository, but its homepage is now http://developers.helloreverb.com/swagger/. Swagger is now by Reverb, and is still licensed under Apache 2.0. Join the Swagger conversation on irc.freenode.net #swagger, or on our Google groups.

Atmosphere

Wordnik fully sponsored Atmosphere’s development, and Reverb will continue to do the same. Atmosphere has a page at http://developers.helloreverb.com/atmosphere and lives in the same GitHub repository as before, under https://github.com/atmosphere/Atmosphere. Atmosphere will also continue with its original Apache 2.0 license.

Scalatra

Scalatra is now sponsored by Reverb, and is driving integration between its beautiful and concise Scala DSL, Atmosphere, and Swagger. Look for great things from this tiny yet powerful framework.

So what’s changed, other than the sponsor? Our commitment to the developer community is stronger than ever and we’re proud of the trust that our developers have in us. Look for tight integration across these products in the near future, as well as new tools which leverage the Reverb infrastructure. Thank you for your support, trust, and feedback.

The Reverb Engineering Team


A Day in the Life of a Wordnik Dev: Russell Horton

by admin  |  Posted January 8, 2013

Continuing our series on the lives of Wordnik developers, today we talk to Russell Horton, aka @ngr_am, Computational Linguist. Russ lets us in on his favorite tools, languages, and a spicy hobby.

What’s your favorite coding editor/IDE? Why?

Sublime Text 2, because of batch edits, ⌘-P, and the package ecosystem.

What’s your beverage of choice?

Green tea.

The best lunch within five blocks of the office?

Hella Vegan burrito at Curry Up Now.

What’s your favorite music to listen to while working? And your sound system of choice?

Depending on the task, I like to listen to electronica, hip-hop or NPR, on the Cambridge Model Twelve.

What are your favorite languages?

I love English. Scala and Python are pretty nice, too.

Which language do you think is terrible?

I’m not much of a language snob. I even kinda miss Perl. But this just ain’t right:

<?php
if ("almond milk lattes" == 0) {
echo "WTHF?";
}
?>

What was your first language? When did you learn it?

BASIC, in 1987, on our Gateway 386DX 25MHz (with Turbo!). Thanks Mom and Dad!

Where do you go for your tech news?

Twitter, the folks in the office.

Where do you go for help?

The folks in the office, StackOverflow, IRC.

What’s the best thing you’ve read about coding lately?

The Dispatch documentation is pleasant.

What’s the worst thing you’ve read about coding lately?

I try not to read awful things.

What’s your favorite book?

I don’t have favorites, but I recently read The Moon is a Harsh Mistress by Robert A. Heinlein and really enjoyed it.

If you had a time machine, what would you go back and tell your younger self?

Buy Apple stock when I was advised to in 1998. To be fair, I was doing support on LCIIs, Performas and Quadras at the time, so it didn’t seem too obvious.

If you weren’t a dev, what would you be?

If I weren’t a computational linguist, I would be a non-computational linguist.

What do you like to do when you’re not coding?

Grow and eat the world’s hottest peppers.

What’s your strategy for hitting the bell in Wordnik’s Friday contest?

Make the plane of your body parallel to the flight path of the dart. Extend your arm as straight and as far as possible, and sight along the barrel. Have a light grip, and squeeze the trigger with the tip of your index finger. Disclosure: I have never won. But I did stick a suction dart to the bell, twice.
