Sentiment API for Swedish

Many of our classifiers are now available in several languages. One of the most popular that is now also available in Swedish is Sentiment. It determines whether a text is positive or negative by analyzing its use of language.

For topic categorization, IAB Taxonomy V2 and all Topics classifiers are also available in several languages, including Swedish.

Our API is free to use up to 500 calls/day. Beyond that there are several pricing tiers, starting at 9€ per month (5000 calls/day).

This post announces that many of our classifiers are available in Swedish.

Happy new 2018

Here is a short summary of 2017 and some glimpses into 2018.

Last year was a good year for uClassify. The main theme was to offer classifiers in multiple languages (English, Spanish, French and Swedish). The task was nontrivial and we decided to keep it in ‘beta’ for a long time to make sure it works and scales as intended. Now we feel confident enough to move out of beta and start promoting the service.

We created a few new classifiers for our users. The most popular are the IAB Taxonomy V2 and Language Detection classifiers (I am particularly proud of the latter's capability to detect 370 different languages!).

For the second half of 2017 I went on parental leave. During this time I mostly monitored uClassify, answered emails and pushed a few fixes.

As a hobby project I created a site with tons of generated number sequences, if you are into that kind of thing.

Thoughts about 2018

At the beginning of 2018 we will add more classifiers in different languages, move out of beta and do some promoting.

As for the next big features, we are not entirely sure. There is a lot of demand for URL batching. For various reasons we’ve been dodging this in the past, but it deserves reconsideration.

During parental leave I played a lot with classification of numeric data, images and time series (as opposed to text). This is something I think might find its way into the platform, although I am not sure in what form.

Another thing we should do is publish API clients in different languages (Java, Python, C# etc.).

During the coming month (my last month on parental leave) I’ll start with some of the tasks and set a plan for the rest of the year.

Happy New Year, everyone!


Discourse classifier

We have added a new classifier that can determine the discourse of a text. For example, it can distinguish questions from answers and tell whether an answer is an agreement or a disagreement. It even tries to detect humor in the text. The classes are listed below.

  • Agreement
  • Announcement
  • Answer
  • Appreciation
  • Disagreement
  • Elaboration
  • Humor
  • Negative_reaction
  • Other
  • Question



Since long texts often have mixed discourse, containing questions, answers, elaborations, humor and so on, it may make sense to split the text and pass single sentences or phrases for classification.
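A minimal sketch of the splitting step, assuming a simple punctuation-based splitter is good enough for your texts. The function name `split_sentences` is made up for illustration and is not part of the uClassify API. Each resulting piece could then be sent to the classifier on its own.

```python
import re

def split_sentences(text):
    """Split text into rough sentences on '.', '!' or '?' followed by whitespace."""
    # Naive rule: it also splits after abbreviations such as "e.g.",
    # so treat the result as approximate.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]

comment = "Is this a bug? I think so. Thanks for the great write-up!"
for sentence in split_sentences(comment):
    print(sentence)
```

A proper sentence tokenizer would handle abbreviations and quotes better. The point here is only that short, single-purpose pieces are more likely to carry one discourse type each.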

It’s based on the dataset from the paper “Characterizing Online Discussion Using Coarse Discourse Sequences” (ICWSM ’17). The dataset is built from annotated Reddit comments.

Spanish, French and Swedish classifier languages

During the last half year the Sentiment classifier has been beta enabled for Spanish, French and Swedish. The test period has been very successful and we have decided to expand multi-language support to more popular classifiers such as the Gender Analyzer, Mood and Myers-Briggs classifiers.

Classifiers that support multiple languages are shown with flag icons. From the GUI you can test them by clicking the flag first. From the API you simply add the language code (/es, /fr, /sv) to the request URL. For more information see the documentation.
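As a rough sketch of what adding the language code could look like, the helper below builds a request URL. The base URL, the path layout, the position of the language code and the function name `classify_url` are all assumptions made for illustration; the documentation is the authority on the real endpoint.

```python
from urllib.parse import quote

# Assumption: base URL and path layout are illustrative only.
BASE_URL = "https://api.uclassify.com/v1"
# The three language codes named in this post.
BETA_LANGUAGES = {"es", "fr", "sv"}

def classify_url(username, classifier, language=None):
    """Build a classify URL, optionally with a language code segment."""
    parts = [BASE_URL, quote(username), quote(classifier)]
    if language is not None:
        if language not in BETA_LANGUAGES:
            raise ValueError("unsupported language code: " + language)
        # Assumption: the language code goes right after the classifier name.
        parts.append(language)
    parts.append("classify")
    return "/".join(parts)

print(classify_url("uclassify", "sentiment", "sv"))
```

Keeping the language optional means existing English-only calls do not need to change.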

The service is still in beta, as we still need to make sure it scales when more users start to use it. The API will probably not change.

New XML text element

Our XML API has been around since the release of uClassify back in 2008. It’s very flexible and powerful. Previously, to avoid breaking the XML, all texts passed needed to be base64 encoded in the <textBase64> element. With this release we introduce the <text> element, which doesn’t require base64 encoding. The <textBase64> element is of course still supported.

The new <text> element can take plain text. This saves some bandwidth, improves performance and makes the API easier to use. The string needs to be XML encoded so it doesn’t break the XML. Most languages have support functions for this; look for “escape XML” or similar. Basically it replaces 5 characters (<, >, &, ' and ") with their encodings (&lt; etc.).

<text>I love new features &amp; would like to see more in the future</text>
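For illustration, here is a small Python sketch that produces both element forms using only the standard library. The helper names `text_element` and `text_base64_element` are made up for this example, and UTF-8 as the byte encoding before base64 is an assumption. The rest of the request XML is not shown.

```python
import base64
from xml.sax.saxutils import escape

def text_element(text):
    """Wrap plain text in <text>, escaping the five special XML characters."""
    # escape() handles &, < and > by default; quotes are added explicitly.
    escaped = escape(text, {'"': "&quot;", "'": "&apos;"})
    return "<text>" + escaped + "</text>"

def text_base64_element(text):
    """Wrap text in <textBase64> (assumes UTF-8 bytes before encoding)."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "<textBase64>" + encoded + "</textBase64>"

sample = "I love new features & would like to see more in the future"
print(text_element(sample))
print(text_base64_element(sample))
```

The first line printed matches the example above. The base64 form is noticeably longer for the same input, which is where the bandwidth saving comes from.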


The new <text> element makes the implementation of our next big feature easier… 😉

IAB Taxonomy V2

The Interactive Advertising Bureau (IAB) released version 2 of their taxonomy on the first of March 2017. The new taxonomy contains more topics than the old one and has gone through a general overhaul to make it clearer.

We have built a new classifier, IAB Taxonomy V2, that conforms to the latest standard.

The new ‘Content’ category has been left out but you can get content language by calling our Language Detector.

Any feedback is appreciated and we may add more training data if necessary.

Class name format

The new taxonomy has up to 4 tiers, which is reflected in the class names. The format of the class names is level1_leaf_id1_id2_id3_id4, where the ids correspond to the IAB codes and are integers.
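One way to take such a class name apart is sketched below. It assumes that the ids are the trailing integer parts and that the first part is the level 1 name; the function name and the sample class name (including its ids) are made up and are not real IAB codes.

```python
def parse_iab_class_name(class_name):
    """Split 'level1_leaf_id1_..._idN' into names and integer ids."""
    tokens = class_name.split("_")
    ids = []
    # Peel integer ids off the end, keeping their original order.
    while tokens and tokens[-1].isdecimal():
        ids.insert(0, int(tokens.pop()))
    level1 = tokens[0] if tokens else ""
    # Assumption: whatever remains after level1 is the leaf name.
    leaf = "_".join(tokens[1:])
    return {"level1": level1, "leaf": leaf, "ids": ids}

# Made-up class name, only to show the shape of the result.
print(parse_iab_class_name("sports_soccer_10_20"))
```

The number of ids tells you how deep in the taxonomy the class sits, from one id for a top-level topic up to four.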

You can read more about the taxonomy at their homepage where you also can find the complete id mapping.

Classifier accuracy improvements

We have updated some of our most popular classifiers to give better results.

Our most popular classifier, for sentiment, has been updated to give better performance. The major difference is that the data has gone through a cleaning pass that removed non-English texts (noise). With a slightly improved feature extractor and optimized data we can expect better accuracy.

We’ve also updated the following popular classifiers to use a new feature extractor. The result is better accuracy.

We might update more classifiers in the future.

Language Detector for 370+ major and rare languages

We have built a language detector that covers about 374 languages.

It can detect both living and extinct languages (e.g. English and Tupi), identify ancient and constructed languages (e.g. Latin and Klingon) and even tell different dialects apart.

Each language class is named with its English name followed by an underscore and the corresponding ISO 639-3 three-letter code. For example:

  • Swedish_swe
  • English_eng
  • Chinese_zho
  • Mesopotamian Arabic_acm
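A small sketch of how a client might split these class names back into a name and a code. The function name is made up for illustration; splitting on the last underscore is a choice that keeps names with spaces, such as Mesopotamian Arabic, intact.

```python
def parse_language_class(class_name):
    """Split 'EnglishName_iso' into (name, three-letter code)."""
    # Split on the last underscore only, so the name part is kept whole.
    name, code = class_name.rsplit("_", 1)
    return name, code

for class_name in ["Swedish_swe", "English_eng", "Chinese_zho", "Mesopotamian Arabic_acm"]:
    name, code = parse_language_class(class_name)
    print(code, "->", name)
```

Parsing the name instead of keeping a hard-coded list of classes also means the code keeps working if more languages are added later.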

You can try it here. It needs a few words to make accurate detections.

Some of the rare languages (about 30) may have insufficient training data. The idea is to improve the classifier as more documents are gathered. Also we may add more languages in the future, so make sure your code can handle that.

Here is the full list of supported languages:

Language Name ISO 639-3 Type
Abkhazian abk living
Achinese ace living
Adyghe ady living
Afrihili afh constructed
Afrikaans afr living
Ainu ain living
Akan aka living
Albanian sqi living
Algerian Arabic arq living
Amharic amh living
Ancient Greek grc historical
Arabic ara living
Aragonese arg living
Armenian hye living
Arpitan frp living
Assamese asm living
Assyrian Neo-Aramaic aii living
Asturian ast living
Avaric ava living
Awadhi awa living
Aymara aym living
Azerbaijani aze living
Balinese ban living
Bambara bam living
Banjar bjn living
Bashkir bak living
Basque eus living
Bavarian bar living
Baybayanon bvy living
Belarusian bel living
Bengali ben living
Berber ber living
Bhojpuri bho living
Bishnupriya bpy living
Bislama bis living
Bodo brx living
Bosnian bos living
Breton bre living
Bulgarian bul living
Buriat bua living
Burmese mya living
Catalan cat living
Cebuano ceb living
Central Bikol bcl living
Central Huasteca Nahuatl nch living
Central Khmer khm living
Central Kurdish ckb living
Central Mnong cmo living
Chamorro cha living
Chavacano cbk living
Chechen che living
Cherokee chr living
Chinese zho living
Choctaw cho living
Chukot ckt living
Church Slavic chu ancient
Chuvash chv living
Coastal Kadazan kzj living
Cornish cor living
Corsican cos living
Cree cre living
Crimean Tatar crh living
Croatian hrv living
Cuyonon cyo living
Czech ces living
Danish dan living
Dhivehi div living
Dimli diq living
Dungan dng living
Dutch nld living
Dutton World Speedwords dws constructed
Dzongkha dzo living
Eastern Mari mhr living
Egyptian Arabic arz living
Emilian egl living
English eng living
Erzya myv living
Esperanto epo constructed
Estonian est living
Ewe ewe living
Extremaduran ext living
Faroese fao living
Fiji Hindi hif living
Finnish fin living
French fra living
Friulian fur living
Fulah ful living
Gagauz gag living
Galician glg living
Gan Chinese gan living
Ganda lug living
Garhwali gbm living
Georgian kat living
German deu living
Gilaki glk living
Gilbertese gil living
Goan Konkani gom living
Gothic got ancient
Guarani grn living
Guerrero Nahuatl ngu living
Gujarati guj living
Gulf Arabic afb living
Haitian hat living
Hakka Chinese hak living
Hausa hau living
Hawaiian haw living
Hebrew heb living
Hiligaynon hil living
Hindi hin living
Hmong Daw mww living
Hmong Njua hnj living
Ho hoc living
Hungarian hun living
Iban iba living
Icelandic isl living
Ido ido constructed
Igbo ibo living
Iloko ilo living
Indonesian ind living
Ingrian izh living
Interlingua ina constructed
Interlingue ile constructed
Iranian Persian pes living
Irish gle living
Italian ita living
Jamaican Creole English jam living
Japanese jpn living
Javanese jav living
Jinyu Chinese cjy living
Judeo-Tat jdt living
K’iche’ quc living
Kabardian kbd living
Kabyle kab living
Kadazan Dusun dtp living
Kalaallisut kal living
Kalmyk xal living
Kamba kam living
Kannada kan living
Kara-Kalpak kaa living
Karachay-Balkar krc living
Karelian krl living
Kashmiri kas living
Kashubian csb living
Kazakh kaz living
Kekchí kek living
Keningau Murut kxi living
Khakas kjh living
Khasi kha living
Kinyarwanda kin living
Kirghiz kir living
Klingon tlh constructed
Kölsch ksh living
Komi kom living
Komi-Permyak koi living
Komi-Zyrian kpv living
Kongo kon living
Korean kor living
Kotava avk constructed
Kumyk kum living
Kurdish kur living
Ladin lld living
Ladino lad living
Lakota lkt living
Lao lao living
Latgalian ltg living
Latin lat ancient
Latvian lav living
Laz lzz living
Lezghian lez living
Láadan ldn constructed
Ligurian lij living
Lingala lin living
Lingua Franca Nova lfn constructed
Literary Chinese lzh historical
Lithuanian lit living
Liv liv living
Livvi olo living
Lojban jbo constructed
Lombard lmo living
Louisiana Creole lou living
Low German nds living
Lower Sorbian dsb living
Luxembourgish ltz living
Macedonian mkd living
Madurese mad living
Maithili mai living
Malagasy mlg living
Malay zlm living
Malay msa living
Malayalam mal living
Maltese mlt living
Mambae mgm living
Mandarin Chinese cmn living
Manx glv living
Maori mri living
Marathi mar living
Marshallese mah living
Mazanderani mzn living
Mesopotamian Arabic acm living
Mi’kmaq mic living
Middle English enm historical
Middle French frm historical
Min Nan Chinese nan living
Minangkabau min living
Mingrelian xmf living
Mirandese mwl living
Modern Greek ell living
Mohawk moh living
Moksha mdf living
Mon mnw living
Mongolian mon living
Morisyen mfe living
Moroccan Arabic ary living
Na nbt living
Narom nrm living
Nauru nau living
Navajo nav living
Neapolitan nap living
Nepali npi living
Nepali nep living
Newari new living
Ngeq ngt living
Nigerian Fulfulde fuv living
Niuean niu living
Nogai nog living
North Levantine Arabic apc living
North Moluccan Malay max living
Northern Frisian frr living
Northern Luri lrc living
Northern Sami sme living
Norwegian nor living
Norwegian Bokmål nob living
Norwegian Nynorsk nno living
Novial nov constructed
Nyanja nya living
Occitan oci living
Official Aramaic arc ancient
Ojibwa oji living
Old Aramaic oar ancient
Old English ang historical
Old Norse non historical
Old Russian orv historical
Old Saxon osx historical
Oriya ori living
Orizaba Nahuatl nlv living
Oromo orm living
Ossetian oss living
Ottoman Turkish ota historical
Palauan pau living
Pampanga pam living
Pangasinan pag living
Panjabi pan living
Papiamento pap living
Pedi nso living
Pennsylvania German pdc living
Persian fas living
Pfaelzisch pfl living
Picard pcd living
Piemontese pms living
Pipil ppl living
Pitcairn-Norfolk pih living
Polish pol living
Pontic pnt living
Portuguese por living
Prussian prg living
Pulaar fuc living
Pushto pus living
Quechua que living
Quenya qya constructed
Romanian ron living
Romansh roh living
Romany rom living
Rundi run living
Russia Buriat bxr living
Russian rus living
Rusyn rue living
Samoan smo living
Samogitian sgs living
Sango sag living
Sanskrit san ancient
Sardinian srd living
Saterfriesisch stq living
Scots sco living
Scottish Gaelic gla living
Serbian srp living
Serbo-Croatian hbs living
Seselwa Creole French crs living
Shona sna living
Shuswap shs living
Sicilian scn living
Silesian szl living
Sindarin sjn constructed
Sindhi snd living
Sinhala sin living
Slovak slk living
Slovenian slv living
Somali som living
South Azerbaijani azb living
Southern Sami sma living
Southern Sotho sot living
Spanish spa living
Sranan Tongo srn living
Standard Latvian lvs living
Standard Malay zsm living
Sumerian sux ancient
Sundanese sun living
Swabian swg living
Swahili swa living
Swahili swh living
Swati ssw living
Swedish swe living
Swiss German gsw living
Tagal Murut mvv living
Tagalog tgl living
Tahitian tah living
Tajik tgk living
Talossan tzl constructed
Talysh tly living
Tamil tam living
Tarifit rif living
Tase Naga nst living
Tatar tat living
Telugu tel living
Temuan tmw living
Tetum tet living
Thai tha living
Tibetan bod living
Tigrinya tir living
Tok Pisin tpi living
Tokelau tkl living
Tonga ton living
Tosk Albanian als living
Tsonga tso living
Tswana tsn living
Tulu tcy living
Tupí tpw extinct
Turkish tur living
Turkmen tuk living
Tuvalu tvl living
Tuvinian tyv living
Udmurt udm living
Uighur uig living
Ukrainian ukr living
Umbundu umb living
Upper Sorbian hsb living
Urdu urd living
Urhobo urh living
Uzbek uzb living
Venda ven living
Venetian vec living
Veps vep living
Vietnamese vie living
Vlaams vls living
Vlax Romani rmy living
Volapük vol constructed
Võro vro living
Walloon wln living
Waray war living
Welsh cym living
Western Frisian fry living
Western Mari mrj living
Western Panjabi pnb living
Wolof wol living
Wu Chinese wuu living
Xhosa xho living
Xiang Chinese hsn living
Yakut sah living
Yiddish yid living
Yoruba yor living
Yue Chinese yue living
Zaza zza living
Zeeuws zea living
Zhuang zha living
Zulu zul living


The classifier has been trained by reading texts in many different languages. Finding high-quality texts without noise is really difficult. Many thanks to:

  1. Wikipedia that exists in so many languages
  2. Tatoeba, which is a great resource for clean sentences in many languages

New account limits

We have updated the daily quota limits for the different accounts. If you subscribed to an account before the 3rd of March 2017, this will not affect you.

The adjustments were made after looking at the statistics of our users. Previously, the Indie and Professional accounts had the same rate limits but different prices. The new model is a ladder from Indie to Professional accounts, with a growing discount the higher you climb.

This also affected our translation API, which now gives you 40 characters per call instead of 10.

The new pricing can be found here. We will see how it works out over the coming weeks, and we might have to make some adjustments.