Hey there,

well, it depends on how much you like to play around with things.

As for local options, there is the im2txt model:
https://github.com/HughKu/Im2txt
It's quite interesting, though the code is quite out of date (it is still written for TensorFlow 1.x; the current version is 2.x, which is not backward compatible), so some fiddling will be necessary to get it working.

Google's Inception is also an option:
https://github.com/tensorflow/models
This repository currently doesn't contain it, but it definitely did in the past, so if you can work with git, you can jump a few years back in the history to find it. I'm not sure whether they updated it for TF 2, so again some playing may be necessary to get the right environment. There is in fact a newer version of Inception, v4, but I have not tested that one myself and don't know whether there are any simple-to-use applications for it.

As for the difference between Im2txt and Inception: Im2txt describes whole scenes, like the example one, "A man surfing a wave", while Inception just recognizes objects (man, surf, wave perhaps) and tells you how sure it is that each item is there.

There are also online systems, if you don't mind sharing your photos with third parties. Probably the best one I've seen so far is Cloudsight:
https://cloudsight.ai/
It is the same service that is used for Taptapsee and Camfind. Their descriptions are very accurate; they use machine learning combined with human oversight, so even though the recognition used to take about 15 - 20 seconds (I don't know about the current state), the results were usually worth it. They now even have video recognition and offline object detection, though I don't know how accurate those are.

Cloudsight has a public API, which even provides free recognition up to some point. You can try it on their website and could perhaps make a script for it, so you can access the service easily from the command line.
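To give an idea of what such a script could look like on the renaming side, here is a minimal Python sketch. It assumes you already have a caption as plain text from one of the tools above (im2txt, Cloudsight or anything else); how you obtain it is left out, since I don't want to guess at their interfaces. The function names caption_to_stem and rename_with_caption are made up for this example and use only the standard library.

```python
import re
import unicodedata
from pathlib import Path


def caption_to_stem(caption: str, max_words: int = 6) -> str:
    """Turn a caption like 'A man surfing a wave' into 'a-man-surfing-a-wave'."""
    # Drop accents so the resulting filename is plain ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", caption)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep only runs of letters and digits, lowercased.
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    # Fall back to a fixed name if the caption had nothing usable.
    return "-".join(words[:max_words]) or "undescribed"


def rename_with_caption(photo: Path, caption: str) -> Path:
    """Rename a photo after its caption; add a counter if the name is taken."""
    stem = caption_to_stem(caption)
    suffix = photo.suffix.lower()
    target = photo.with_name(stem + suffix)
    counter = 2
    # Never overwrite another photo that got the same description.
    while target.exists() and target != photo:
        target = photo.with_name(f"{stem}-{counter}{suffix}")
        counter += 1
    photo.rename(target)
    return target
```

Calling rename_with_caption(Path("20140101000000123.JPG"), "A man surfing a wave") would rename the file to a-man-surfing-a-wave.jpg, and a second photo with the same caption would get -2 appended instead of replacing the first one.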
Maybe there are already some such scripts on GitHub; that might be worth checking as well if you're not into programming.

Best regards

Rastislav

On 26 May 2021 at 23:47, Linux for blind general discussion wrote:
> Okay, I'm aware of Tesseract and cuneiform for doing OCR on image
> files, but I was wondering if anyone on this list knew of any
> command-line utilities that might be able to tell me useful things
> about the contents of images that contain no text. Even something as
> simple as printing the image's palette in descending order of
> abundance or recognition of basic geometric shapes would be useful I
> think.
>
> My primary use case is giving meaningful filenames to digital photos
> where I know what photos are in the set, but not which photo is which,
> and primarily, the photos are of crafts I've made and taken with the
> camera my portable mediaplayer/talking eReader uses for OCRing print
> documents (the device gives the photos very long, numeric filenames
> that might be timestamps, but even that isn't of much use if I take
> more than one photo in a round of blind photography and transferring
> photos to my Desktop, especially since the device's clock resets to
> midnight the morning of January 1, 2014 whenever the battery is
> pulled out).
>
> I've tried googling and searching the package lists in Aptitude, but
> all I've managed to find are libraries for writing computer vision code
> into robotics projects or cloud-based complex object AI stuff.
>
> _______________________________________________
> Blinux-list mailing list
> Blinux-list@xxxxxxxxxx
> https://listman.redhat.com/mailman/listinfo/blinux-list