New Computer Vision Model (v2.13) with 1,656 new taxa

We released a new computer vision model today. It has 88,517 taxa, up from 86,861. This new model (v2.13) was trained on data exported on March 31, 2024.

Here's a graph of the model release schedule since early 2022 (segments extend from data export date to model release date) and how the number of species included in each model has increased over time.

Our goal is to maintain or improve accuracy while adding more taxa to the model. The graph below shows model accuracy estimates using 1,000 random Research Grade observations in each group that were not seen during training. The paired bars compare the average accuracy of model 2.12 with that of the new model 2.13. Each bar shows the accuracy from Computer Vision alone (dark green) and Computer Vision + Geo (green). Overall, the average accuracy of 2.13 is 88.2%, statistically the same as 2.12 at 88.1% (as described here, we expect roughly 2% variance between experiments, all other things being equal).
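One rough way to see why a 0.1 point difference is within noise: if each group's estimate is treated as a simple random sample of 1,000 observations (an assumption for illustration, not necessarily how the linked explanation derives the ~2% figure), the normal-approximation margin on an accuracy near 88% is about 2 points. The function name below is hypothetical.

```python
import math

def accuracy_margin(accuracy: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    Assumes the n test observations are an independent random sample;
    z = 1.96 corresponds to a 95% interval.
    """
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n)

# Accuracy figures and sample size are taken from the post.
margin = accuracy_margin(0.882, 1000)
difference = abs(0.882 - 0.881)

print(f"margin: {margin:.3f}")          # margin: 0.020
print(f"difference: {difference:.3f}")  # difference: 0.001
```

Under these assumptions the gap between 2.12 and 2.13 is far smaller than the sampling margin, which is consistent with calling the two results statistically the same.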

Here is a sample of new species added to v2.13:

Posted on May 16, 2024 09:47 PM by loarie

Comments

Wonderful!! Thank you for more amazing updates to the computer vision.

Posted by pinefrog 5 months ago

"Computer Vision + Geo (geen)." ...I assume you mean "green" -- or "light green".

Posted by astra_the_dragon 5 months ago

I see many new Chironomid species added. Thanks to all members of the 'non-biting task force' 🤗

Posted by carnifex 5 months ago

Is there any way to check if any of my unidentified observations can be identified using this new computer model?

Posted by stariplativky 5 months ago

That the Geo adds so little to the result is a bit hard to believe. Are original values such as day of month of observation (±45 days) not used anymore?

Posted by ahospers 5 months ago

Thanks for the fantastic work and for the transparency with which you release data and results on each modeling iteration. Out of curiosity, is there any promising general-purpose foundational model (not trained specifically on iNat data) that achieves similar performance?

Posted by radrat 5 months ago

Do we have a sense of how many species we can cram into this model?

Posted by mmulqueen 5 months ago

Great work folks!

Posted by susanhewitt 5 months ago

I've noticed that since the new model is out, there are suddenly a lot more Mercenaria mercenaria being CV-identified as Rangia cuneata in the mid-Atlantic coast of the USA. Do these models take feedback?

Posted by amr_mn 5 months ago

Nice, some of the spiders I have been working on made it to the new model. Thanks for the regular update 👍

Posted by ajott 5 months ago

Is there any way to request species be added to the CV? There's a particular recently described species that is very region-specific that could benefit from CV in a very bad way because CV is still erroneously suggesting the European species name for North American observations, despite the proper species already having 400+ RG observations.

Posted by lothlin 5 months ago

@lothlin I suspect the species is already included in the model if it has 400+ RG obs. What is it?

Posted by loarie 5 months ago

Macrolepiota macilenta. It is not (as far as I can tell) - the paper was only published recently and I checked and the species wasn't added in this most recent update.

Looks like Macrolepiota macilenta was created on April 26 of this year so it wouldn't be in v2.13 which was based on a data export from March 31, 2024. It will be included in the next model, v2.14, which is based on a data export made on May 12, 2024.

How many fungi species are there in total, please?

Posted by jonasgruska 5 months ago

v2.13 includes 3,530 fungi taxa

Awesome, thanks Loarie!

Question, Fragaria × ananassa was originally only in there because it was ranked as a species, not a hybrid. It has just been changed to a hybrid, so is it no longer in the CV?

Posted by leytonjfreid 5 months ago

correct - hybrids are not included in the model taxonomy

What happened with the cropping change (the light green) that was mentioned in previous update?

https://static.inaturalist.org/wiki_page_attachments/3853-original.png

Posted by rudolphous 5 months ago

What is the cropping change? And is it not possible to add models 2.10 and 2.11 into the same evaluation with the same 1,000 photos? https://www.inaturalist.org/blog/91824

The last 2% is rather a lot (the last 2% is the hardest part). "Cropping Change" is a slight modification to the way images are prepared before they are sent to the CV model that resulted in an average 2.1% improvement.

The cropping change resulted from some method improvements to how images are processed before they are sent to the computer vision model that we made between v2.11 and v2.12. We didn't have capacity to make additional method improvements between v2.12 to v2.13, but that cropping improvement is still in place and is reflected in the accuracy of v2.13 (compared to what it would have been had we not made those changes).

The cropping change stems from the fact that the computer vision model needs to examine a square image, which means that when dealing with non-square images we have options like squeezing and clipping. Based on some experiments, we made some small changes to this processing pipeline which yielded the ~2% improvement.
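To make the two options concrete, here is a minimal sketch of the geometry involved in turning a non-square image into a square one. It is not iNaturalist's actual pipeline: the function names and the target size of 300 pixels are assumptions made for illustration only.

```python
def center_crop_box(width: int, height: int) -> tuple[int, int, int, int]:
    """Clipping: return (left, top, right, bottom) of the largest centered square.

    Pixels outside the box are discarded, so the aspect ratio is preserved
    but content near the long edges is lost.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

def squeeze_scale(width: int, height: int, target: int) -> tuple[float, float]:
    """Squeezing: return the (x, y) scale factors that map the whole image
    onto a target x target square. All pixels are kept, but the image is
    distorted whenever the two factors differ.
    """
    return (target / width, target / height)

# A hypothetical 1600 x 1200 landscape photo.
print(center_crop_box(1600, 1200))     # (200, 0, 1400, 1200)
print(squeeze_scale(1600, 1200, 300))  # (0.1875, 0.25)
```

Clipping drops 200 pixels from each side of this photo, while squeezing keeps everything but compresses the width more than the height. A pipeline can also blend the two, which is the kind of trade-off the experiments mentioned above would need to weigh.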

We agree there would be certain advantages to using the same 1,000 photos to evaluate all models, but we're not currently doing that because of the complexity involved in holding out that test set given the dynamic nature of iNat. There's also significant taxonomic drift between data/model versions, which adds complexity and is why we're currently focused on just comparing two models (the previous model and the new model) rather than trying to track improvements across multiple models, though we agree that would be ideal.

Thanks for the clear explanation.

Just curious. Is the next update of the computer vision model a bit delayed?

Posted by rudolphous 4 months ago

Yes - there was an issue with v2.14 which delayed it - it should be ready soon

Posted by loarie 3 months ago

Great you fixed the issue 👍👍👍

Posted by rudolphous 3 months ago

2.14 getting closer?

Posted by dianastuder 3 months ago

yes - it's ready and should be released this week - apologies for the delay
