The Language Detection model can process text up to 5120 characters long. It would be handy if this limit were increased.
Hi @XavierV ,
Thanks for reaching out on that.
Do you have an example of how this increase could help you?
The Language Detection model would likely still give an accurate result if you truncate the text to 5k characters.
Hi @Antrod ,
It is indeed likely that the model returns a result which is accurate enough with 5k characters.
However, when analyzing multilingual emails, for example, 5k characters has proven insufficient to reliably detect all the languages present. Currently the body of the email (after being stripped of HTML) must be split into pieces for analysis, and this can lead to faulty results.
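For anyone hitting the same limit, here is a minimal sketch of the splitting workaround described above. It assumes the limit is counted in characters, as stated in this thread. `split_text` and `MAX_CHARS` are hypothetical names used for illustration only, and the call to the Language Detection model itself is left out.

```python
MAX_CHARS = 5120  # limit mentioned in this thread (assumed to be counted in characters)


def split_text(text, limit=MAX_CHARS):
    """Split text into pieces of at most `limit` characters.

    Prefers to cut after the last space or newline inside each window,
    and falls back to a hard cut when the window contains no whitespace.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Last whitespace position inside the current window, or -1 if none.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the preceding chunk
        chunks.append(text[start:end])
        start = end
    return chunks
```

Each chunk would then be sent to the model separately. A cut can still land in the middle of a passage written in one language, so the per-chunk results need to be merged with care, which is the source of the faulty results mentioned above.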