This is fascinating work by Shishir Patil and others from the University of California, Berkeley. It looks at how #LLM-type applications can make better use of #API invocations. Gorilla: Large Language Model Connected with Massive APIs. What's surprising to me is that even this still doesn't seem to take the structural richness of #OpenAPI into account. It's all about learning from examples and looking at documentation (if I understand correctly). It seems that there could be a lot of potential in truly understanding the richness of the OpenAPI model. A little while ago Alexandre Airvault was explaining how to use #ChatGPT's API plugin model and that currently this is just based on the textual descriptions of the API.

Modern browsers are magical! HTML5 has added many #APIs to browsers and it's hard to keep track of what the Web platform can do nowadays. Sometimes this can also have unintended side-effects, such as in the case of the W3C's Web Speech #API. The Web Speech API supports speech recognition inside the browser, allowing web apps to use speech recognition (and synthesis). It's a powerful feature and opens the door to many new use cases.

But sometimes details can get in the way of the Wonderful World Wide Web. In the case of the Google Chrome desktop browser, the implementation of the Web Speech API is just a thin wrapper around Google's Speech-to-Text API. This means that when a web app asks a Chrome browser to do speech recognition, it's actually Google's API doing the work in the background. Privacy issues aside, this may be ok with you as long as it works. But one thing that's hard to manage is that because you're hitting Google's API as a free service, there's a limit on how much data you can transmit. You will hit that limit relatively quickly because it's 10MB and that's not very much when you are streaming audio.

As a user, you're using a web app in Chrome (not all browsers support the Speech API), things are working, and suddenly speech recognition stops. That's because the web app is indirectly using Google's Speech API, which stops working after 10MB of audio. It took me a while to figure this out when I wanted to use a web app with speech recognition. The tricky part is that because it's not your app using the Speech-to-Text API but the Chrome browser, you cannot upgrade the API usage and move over to a paid plan where there is no built-in limit. So you're running into a limit that's a bit tricky to understand and impossible to fix.

The other possibility is to move over to Apple's Safari, which will then use an Apple API instead of a Google API. I'll give this one a try and see whether this one has some built-in limitations as well.

Mostly, this is just an interesting anecdote about the possibilities of today's API landscape, about the sometimes hidden dependencies, and about some of the limitations you first have to fully understand and then may even be incapable of fixing.

#apieconomy #apistrategy #voice #voicerecognition #html5 #webapp #dependency #dependencymanagement #gettingapistowork
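To get a feel for how quickly a 10MB budget runs out when streaming audio, here is a rough back-of-the-envelope sketch. The 10MB figure is the one from the post. Everything else is an assumption for illustration: the post does not say which audio encoding Chrome sends to Google, so the uncompressed PCM formats below are hypothetical, and the names `AudioFormat` and `secondsUntilLimit` are made up for this example.

```typescript
// The limit as reported in the post. Whether "10MB" means 10 * 1024 * 1024
// or 10 * 1000 * 1000 bytes is not stated; the binary reading is assumed here.
const LIMIT_BYTES = 10 * 1024 * 1024;

// Hypothetical description of an uncompressed PCM audio stream.
// The post does not say what format Chrome actually transmits.
interface AudioFormat {
  label: string;
  sampleRateHz: number;
  bytesPerSample: number;
  channels: number;
}

// Seconds of audio that fit into the byte budget for a given format.
function secondsUntilLimit(
  format: AudioFormat,
  limitBytes: number = LIMIT_BYTES
): number {
  const bytesPerSecond =
    format.sampleRateHz * format.bytesPerSample * format.channels;
  return limitBytes / bytesPerSecond;
}

// Two illustrative formats (assumptions, not measurements).
const formats: AudioFormat[] = [
  { label: "16 kHz, 16-bit, mono", sampleRateHz: 16000, bytesPerSample: 2, channels: 1 },
  { label: "48 kHz, 16-bit, mono", sampleRateHz: 48000, bytesPerSample: 2, channels: 1 },
];

for (const format of formats) {
  const minutes = secondsUntilLimit(format) / 60;
  console.log(`${format.label}: about ${minutes.toFixed(1)} minutes of audio`);
}
```

Under these assumptions the budget lasts only a few minutes of continuous speech (roughly five and a half minutes at 16 kHz mono). A compressed encoding would stretch that, but the order of magnitude explains why recognition can stop in the middle of an otherwise working session.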