Eckart Bindewald · Posted June 10

Just posted a short LinkedIn article debunking claims that "super-tiny" language models have performance "comparable" to much larger models. It turns out that if you actually look at the paper, the super-tiny model's performance is at or below random-guess probability on several datasets. At first I thought the problem must lie with the secondary reporting, but now I see that the original authors themselves published a pre-print whose weak results contradict its strong claims. Does anyone know the process for requesting a revised version: contact the authors, or arXiv? Here is the link to the post: Problems with Super-Tiny Language Models - LinkedIn Article
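For anyone wanting to run the same sanity check on other papers: a reported score at or below 1/k on a k-way multiple-choice benchmark is indistinguishable from guessing. A minimal sketch of that check (the dataset, choice count, and score below are hypothetical placeholders, not figures from the paper):

```python
def random_guess_accuracy(num_choices: int) -> float:
    """Chance-level accuracy for a k-way multiple-choice task."""
    return 1.0 / num_choices

def beats_chance(reported_accuracy: float, num_choices: int) -> bool:
    """True only if the reported score exceeds uniform random guessing."""
    return reported_accuracy > random_guess_accuracy(num_choices)

# Hypothetical example: a 4-choice benchmark where a model reports 24% accuracy.
# Chance level is 1/4 = 25%, so this score is below the random-guess baseline.
print(beats_chance(0.24, num_choices=4))
```

The same one-line comparison applies to any benchmark once you know the number of answer options per question.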
Rajorshi · Posted June 10

I think we should reach out to the authors for more clarification on this. On a related note, I'm disappointed that MarkTechPost missed this glaring anomaly.
Aman · Posted June 10

Thanks for sharing, @Eckart Bindewald! And yes - agree with @Rajorshi that this needs to be brought up with the authors.