Thursday, June 20, 2024

Calculating tokens for words

For LLM applications, we often use embedding models like ada-002 or davinci models. While using these models, we need to often estimate the number of tokens that would required for our application. 

For the English language, a good thumb rule is that 3 to 4 chars make up a token. 

A nifty online tool that can help you estimate the number of tokens is: https://www.quizrise.com/token-counter


Wednesday, June 19, 2024

Database Manager - DBGate

 If you are looking for a web-based database manager that is open-source and commercial friendly, then please have a look at DBGate: https://github.com/dbgate/dbgate

Since it is web-based, you can see a good demo here: https://demo.dbgate.org/


Tuesday, June 11, 2024

#no-build for JS: Embracing the Era of Import Maps and HTTP/2

Complex build processes have dominated the web development world for years. JavaScript module bundling, transpiling, and management have become critical tasks for tools such as Webpack. However, what if there was an easier method? 

It is time for us to "un-learn" old tricks and rid ourselves of the baggage of old knowledge! Modern browsers support HTTP/2 and "import maps" for javascript modules and this is an absolute game changer!

In the past, web browsers had trouble comprehending contemporary JavaScript features and had trouble loading many small files quickly. Webpack and other build tools addressed these problems by:

  • Bundling: Putting several JavaScript files together into a single, bigger file to cut down on HTTP requests.
  • Transpiling:transforming current JavaScript syntax into an earlier iteration that works with an earlier browser.

Although these technologies were helpful to us, they increased complexity:
  • Build Configuration: Build procedures frequently cause disruptions to the development workflow, necessitating continuous rebuilding and waiting. 
  • Development Workflow Disruption: Build configurations can be time-consuming to set up and manage.

But all modern browsers (Chrome, Firefox, Safari, Edge) support HTTP/2 and "Import Maps" for JS modules. 
  • Import maps enables us to use alias for full JS module paths - it is a method of creating a central registry that associates module names with their real locations is. This makes things clearer and more versatile by removing the requirement for you to write whole file paths in your code (across multiple files). 
  • HTTP/2 is a more effective and rapid method of sending data across the internet. It makes it unnecessary to bundle files because browsers can handle several tiny files well. Instead of opening many connections to the server (like having many waiters running around), HTTP/2 uses one connection. Thus, multiple JS files can be downloaded concurrently, speeding up page load times.

Friday, June 07, 2024

Should we serve JS files from a CDN?

An excellent article that states why we need to actually use a CDN for serve popular JS libraries:  https://blog.wesleyac.com/posts/why-not-javascript-cdn

Excerpt from the article: 

"This means that one of these CDNs going down, or an attacker hacking one of them would have a huge impact all over society — we already see this category of problem with large swaths of the internet going down every time cloudflare or AWS has an outage."