Index source: http://www.bee.com/id/63700.html

AI Prediction Record: Want to Make Money in Prediction Markets with AI? But It Might Not Even Read the Question Properly | Bee Network

Analysis | Updated 2 months ago | Wyatt | Author: Nan Zhi (@Pembunuh_Malvo)

With many crypto sectors having failed to live up to their narratives, prediction markets have become one of the few sectors in the Crypto space still showing positive growth. On November 20th, Nan Zhi began looking for “smart money” in prediction markets, using the same approach applied to Meme coins last year, and saw good initial results.

In early December, around the launch of Gemini 3 Pro and while testing related models, an idea arose: could AI be used to analyze and forecast prediction markets, pitting humans against AI to see which side predicts more accurately?

Prediction markets are often introduced as a mechanism that moves the market closer to the “truth” by “allowing people with insights to place bets with real money.” However, some argue that Crypto + prediction markets let “insiders” safely profit from information asymmetry, thereby driving the market towards the “insider outcome.” This is essentially a clash between “wisdom of the crowd” and “truth being held by a few.” AI prediction leans towards “wisdom of the crowd,” so it requires access to a vast amount of available knowledge and insights.

For that reason, Gemini and Grok were chosen initially: they draw on Google and the X platform, respectively, which gives them the most direct access to that knowledge. Nan Zhi recently added a “Doubao + Douyin Knowledge” combination, but too few prediction questions have involved it so far, so it is not covered in this article.

Basic Rules

AI versions: Gemini 2.5 Pro (with built-in Google Search) and Grok 4 Fast (called via OpenRouter, with native search enabled).

Question selection: Humans select the questions to bet on, and the AI then makes its own prediction on the same questions. The Crypto category is excluded.

Input content: The official question (title), the official description (Description), and the available answers (in practice only Yes and No).

Note: Polymarket questions are organized into main categories (Events) and subcategories (Markets). An Event is a broad question such as “Who will be the next Fed Chair?” or “When will Saylor sell Bitcoin?”. Each Event contains N Markets, such as “Will Hassett become the next Fed Chair?” or “Will Saylor sell Bitcoin before March 31, 2026?”. To align with human predictions, Markets were used as the questions for the AI to judge. The other options in the same Event are not provided as input; for example, the AI is only asked to judge “Will Hassett become the next Fed Chair?” rather than to choose the most likely candidate from N possibilities.
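The Event/Market split and the per-Market input described above can be sketched roughly as follows. This is a minimal illustration, not the author's code: the class names, the `model_inputs` helper, the dictionary keys, and the "Candidate B" market are all hypothetical, and the description strings are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Market:
    """One Yes/No sub-question inside an Event (hypothetical structure)."""
    question: str
    description: str
    outcomes: tuple[str, ...] = ("Yes", "No")


@dataclass
class Event:
    """A broad question that groups N Markets (hypothetical structure)."""
    title: str
    markets: list[Market] = field(default_factory=list)


def model_inputs(event: Event) -> list[dict]:
    """Build one independent input per Market; sibling Markets are never included."""
    return [
        {
            "title": m.question,
            "description": m.description,
            "options": list(m.outcomes),
        }
        for m in event.markets
    ]


fed_chair = Event(
    title="Who will be the next Fed Chair?",
    markets=[
        Market("Will Hassett become the next Fed Chair?", "(official description)"),
        # Placeholder sibling market, invented for illustration only.
        Market("Will Candidate B become the next Fed Chair?", "(official description)"),
    ],
)
inputs = model_inputs(fed_chair)
```

Each entry in `inputs` stands alone, so the model judging the Hassett market never sees the other candidates listed under the same Event.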

Prompt Design

Require the AI to search for the latest news, official announcements, and expert analysis reports.
Prohibit any use of prediction market data.
Require judgments based on “evidence” and logical reasoning.
Allow only Yes or No as output, accompanied by a paragraph explaining the reasoning.

Current Results

Among the prediction questions, 21 have been settled so far. Grok has the highest win rate at 75%, followed by humans at 66.7%; Gemini is the lowest at 52.4%. The current results can be viewed on the relevant website.
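The prompt rules and the win-rate figures above can be sketched as follows. This is a hedged illustration only: the template wording, the "answer on the first line" convention, and the function names `build_prompt`, `parse_answer`, and `win_rate` are assumptions, since the article does not publish the actual prompt or scoring code.

```python
import re
from typing import Optional

# Hypothetical template; the rule wording paraphrases the article.
PROMPT_TEMPLATE = """\
Question: {title}
Description: {description}
Options: Yes / No

Rules:
1. Search for the latest news, official announcements, and expert analysis reports.
2. Do not use prediction market data.
3. Base the judgment on evidence and logical reasoning.
4. Output "Yes" or "No" on the first line, then one paragraph explaining the reasoning.
"""


def build_prompt(title: str, description: str) -> str:
    """Fill the template with one Market's title and description."""
    return PROMPT_TEMPLATE.format(title=title, description=description)


def parse_answer(reply: str) -> Optional[str]:
    """Return "Yes" or "No" from the first non-empty line, or None if neither leads it."""
    for line in reply.splitlines():
        if line.strip():
            match = re.match(r"\s*(yes|no)\b", line, flags=re.IGNORECASE)
            return match.group(1).capitalize() if match else None
    return None


def win_rate(predictions: list[Optional[str]], outcomes: list[str]) -> float:
    """Share of settled questions answered correctly; unparseable replies count as wrong."""
    if not outcomes:
        raise ValueError("no settled questions")
    correct = sum(p == o for p, o in zip(predictions, outcomes))
    return correct / len(outcomes)
```

As a sanity check on the reported numbers, 14 correct out of 21 gives 66.7% and 11 out of 21 gives 52.4%. A rate of exactly 75% cannot come from 21 graded answers, so Grok's tally presumably covers a different number of questions; the article does not say how many.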

What Mistakes Did the AI Make?

Gemini Occasionally Misjudges the Current Time

In the question “Will Trump’s approval rating hit 35% in 2025?”, Gemini stated that it was currently the first half of 2025, so anything was possible, and gave what amounted to a random answer.

However, when the author queried Gemini programmatically and asked it to output the current time, it answered correctly. It remains unclear why the error in time perception occurred.
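One common way to reduce this kind of time misjudgment is to state the current date explicitly at the top of every prompt rather than leaving the model to infer it. The article does not say whether the author's prompt did this, so the sketch below is an assumption; the helper name `with_current_date` and the header wording are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def with_current_date(prompt: str, now: Optional[datetime] = None) -> str:
    """Prefix the prompt with an explicit UTC date so the model need not infer it."""
    now = now or datetime.now(timezone.utc)
    header = f"Today's date is {now:%Y-%m-%d} (UTC). Treat this as the current date."
    return f"{header}\n\n{prompt}"


example = with_current_date(
    "Will Trump's approval rating hit 35% in 2025?",
    # Illustrative date only; it does not come from the article.
    now=datetime(2025, 12, 15, tzinfo=timezone.utc),
)
```

Passing `now` explicitly keeps the function testable; in normal use it can be omitted and the system clock is used.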

AI Lacks Sufficient Depth of Thought

In the question “Gemini 3.0 Flash released by December 16?”, Grok reasoned that “officials have recently only mentioned Gemini 3 Pro and 2.5 related versions, with very few mentions of 3 Flash, therefore there is insufficient evidence to make a judgment,” taking only current information into account.

Meanwhile, Gemini pointed out that “Gemini 1.0 was released in December 2023, and the experimental version of Gemini 2.0 Flash was launched in December 2024. Continuing this pattern, releasing a 3.0 version by the end of 2025 is logical,” and also noted “a leaked demo about ‘Gemini 3.0 Flash’ circulating in online communities recently (December 14, 2025), further increasing the likelihood of its imminent public release.”

Although Gemini’s answer ultimately turned out to be wrong, this question shows a clear gap in the breadth of information the two models relied on.

AI Infers Based on Common Sense Rather Than Evidence + Logic

In the question “Trump approval Up or Down this week?”, Gemini stated that “predicting a single week’s approval poll rating more than a year in the future is highly uncertain,” which is another instance of time misjudgment. It then argued that “in any given ordinary week, the probability of events causing a slight decline in approval ratings might be slightly higher than the probability of positive events significantly boosting them,” and concluded that a decline was more likely. That conclusion rested solely on subjective common-sense assumptions rather than on evidence.

For the same question, Grok based its reasoning on news reports and polling data regarding “government shutdown, economic concerns, immigration policy disputes, and negative backlash from comments on Rob Reiner’s death,” which matches what the prompt was designed to elicit.

Incorrect Judgment of Settlement Conditions

In the question “Will Trump release the Epstein files by December 20?”, both Gemini and Grok already knew that “the government will release ‘hundreds of thousands of pages’ of documents on Friday (December 19th).” The settlement conditions clearly stated that “if the government publicly releases any files related to Epstein’s illegal activities that were not public before the listed date, it will be judged as Yes.”

However, Gemini stated that “completing the release of ‘all’ files by December 20th is impossible,” clearly misreading the settlement condition, which only requires “any” new file, and therefore gave the wrong answer.
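The gap between the actual settlement rule ("any" new file) and the rule Gemini appears to have applied ("all" files) can be made concrete with a small sketch. The data model, the function names, the year 2025, and the inclusive deadline comparison are all assumptions made for illustration; the two sample files are invented.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class File:
    released_on: Optional[date]  # None means not released yet
    previously_public: bool


def settles_yes_any(files: list[File], deadline: date) -> bool:
    """Stated rule: Yes if at least one previously non-public file is out by the deadline."""
    return any(
        f.released_on is not None
        and f.released_on <= deadline  # inclusive deadline is an assumption
        and not f.previously_public
        for f in files
    )


def settles_yes_all(files: list[File], deadline: date) -> bool:
    """Misread rule: Yes only if every file is out by the deadline."""
    return all(
        f.released_on is not None and f.released_on <= deadline for f in files
    )


deadline = date(2025, 12, 20)  # year assumed from the article's context
files = [
    File(released_on=date(2025, 12, 19), previously_public=False),  # Friday release
    File(released_on=None, previously_public=False),  # still withheld
]
```

With these inputs `settles_yes_any` returns True while `settles_yes_all` returns False, which mirrors how the same known facts lead to opposite answers depending on how the condition is read.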

Summary

Grok’s prediction win rate has already surpassed that of the “smart money” that has made hundreds of thousands or even millions of dollars in profit on prediction markets. However, a closer look at its prediction logic shows that there are still many areas that can be improved and corrected.
