Keyword Extraction

A new keywords API was released a few weeks ago. The old one was not really well designed and needed a revamp.

With the keywords API you can extract keywords from texts with respect to a classifier. For example, if you want to find the words that make a text positive or negative, you extract keywords with the sentiment classifier. If you want to generate tags for a blog post based on topic, you can run it through a topics classifier, or maybe our new IAB Taxonomy classifier.

The result will be a list of keywords where each keyword is associated with one of the classes. Each keyword also has a probability indicating how important it is, a weight if you will. A high value (max 1) means the keyword is very important/relevant.

Example result when extracting keywords with the sentiment classifier:

[
  [
    {
      "className": "positive",
      "p": 0.698862,
      "keyword": "happy"
    },
    {
      "className": "negative",
      "p": 0.831895,
      "keyword": "worse"
    },
    {
      "className": "negative",
      "p": 0.736696,
      "keyword": "bad"
    },
    {
      "className": "negative",
      "p": 0.914509,
      "keyword": "stinks"
    }
  ]
]
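As a rough sketch of how a result like this could be consumed, the Python snippet below groups the keywords by class and sorts them by probability. It assumes the outer list holds one entry per input text (the example has a single entry); the function name `group_keywords` is made up for illustration and is not part of the API.

```python
import json
from collections import defaultdict

# Sample response copied from the example above. Assumption: the outer
# list has one entry per input text, each entry a list of keyword objects.
response_text = """
[
  [
    {"className": "positive", "p": 0.698862, "keyword": "happy"},
    {"className": "negative", "p": 0.831895, "keyword": "worse"},
    {"className": "negative", "p": 0.736696, "keyword": "bad"},
    {"className": "negative", "p": 0.914509, "keyword": "stinks"}
  ]
]
"""


def group_keywords(result):
    """Group one text's keywords by class, highest probability first.

    Hypothetical helper for illustration only.
    """
    grouped = defaultdict(list)
    for item in result:
        grouped[item["className"]].append((item["keyword"], item["p"]))
    for pairs in grouped.values():
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    return dict(grouped)


results = json.loads(response_text)
by_class = group_keywords(results[0])  # keywords for the first (only) text
for class_name, pairs in by_class.items():
    print(class_name, pairs)
```

Sorting by `p` puts the most relevant keyword of each class first, which is handy if you only want the top few as tags.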

You can use the extracted keywords together with their probabilities to create word clouds, just like I did when I investigated the Bechdel test.

[Image: passed_mad_max_unigrams]
These keywords were extracted as indicating a passed Bechdel test. Can you guess which movie?
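One simple way to turn keyword probabilities into word-cloud weights is to scale each `p` value linearly to a font size. The sketch below is only an illustration of that idea, not necessarily how the word cloud above was made; `font_sizes` and its `min_size`/`max_size` parameters are hypothetical names.

```python
def font_sizes(keywords, min_size=12, max_size=48):
    """Map each keyword's probability (0..1) linearly to a font size.

    Hypothetical helper: p = 0 gives min_size, p = 1 gives max_size.
    """
    return {
        item["keyword"]: round(min_size + item["p"] * (max_size - min_size))
        for item in keywords
    }


# Two keyword objects in the same shape as the example result.
keywords = [
    {"className": "negative", "p": 0.914509, "keyword": "stinks"},
    {"className": "positive", "p": 0.698862, "keyword": "happy"},
]

sizes = font_sizes(keywords)
for word, size in sizes.items():
    print(f"{word}: {size}pt")
```

The resulting mapping can then be handed to whatever drawing or word-cloud tool you prefer.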

Here is the keywords documentation.

Latest version

Machine learning is certainly picking up. We are getting a lot more users and requests, and we are really excited by this. So we started 2016 by making a maintenance update, mostly fixes:

  • Fixed broken XML schema links
  • Fixed keyword extraction for classifiers that use features other than unigrams (e.g. the sentiment classifier)
  • Fixed Twitter external login
  • Updated backend libraries to their latest versions
  • Added service terms
  • Pricing adjustments: Indie local server 99 EUR -> 299 EUR, Enterprise 5999 EUR -> 3999 EUR

Right now we are working on the much wished-for JSON API, which will be in the next major release.

uClassify is updating

The old uClassify site has been set to read-only, and the database and classifier migration is done. Now we are just waiting for the DNS change to propagate before the new site can be put into use. This time the site is on an elastic IP, so hopefully we won’t have to do any more of those ‘waiting’ operations in the future.

Hopefully the change will have fully propagated within 24 hours.

Let me know if you have any trouble with your account.