OpenAI discovers way to cut inference costs in half – report


The Information reports that OpenAI has found a way to cut inference costs by more than half.

I haven’t seen the report myself, but this is the kind of thing that, if proven, would hurt demand for memory chips. Micron is up 0.5% today while Sandisk is up 5% on an upgrade from Bernstein.

The SOXX semiconductor ETF is up 3% today and has been hot this year, to say the least.

The report says:

In one previously unreported example, OpenAI engineers earlier this month told some colleagues they had figured out a way to more than halve the cost of inference, or running existing models, thanks to some newly-discovered optimizations, according to a person with knowledge of those discussions.

When the engineers applied the new techniques to power ChatGPT for visitors who didn’t have a free or paid account, it reduced the number of Nvidia graphics processing units needed at one point to just a couple hundred—a shockingly small number. (That said, OpenAI likely doesn’t get much ChatGPT usage from such users, as the company limits how much they can use the chatbot that way.)

This article was written by Adam Button at investinglive.com.


