Exploring Trends of AI and Machine Learning
Artificial intelligence (AI) spans a broad array of techniques and applications aimed at creating systems that can learn, reason, and, in some cases, generate creative outputs. From chatbots and digital assistants to generative AI tools for creating art, music, and video, AI technology is constantly expanding its reach. While data analytics is one use of AI, this thread will cover a wide range of intelligent applications and advancements. Here, I’ll be providing updates on cutting-edge trends in AI, exploring its impact across different fields, and keeping you informed about the latest breakthroughs in the industry.
11/1/2024
Tech giants aim to invest $325 billion in cloud computing
Feb. 7, 2025
Meta, Microsoft, Amazon, and Alphabet (Google's parent company) are projected to invest a combined $325 billion in capital expenditures and infrastructure in 2025, driven by their ongoing push to expand artificial intelligence capabilities.
Amazon: The company plans to allocate over $105 billion towards enhancing its AI infrastructure, primarily through Amazon Web Services (AWS). This investment aims to bolster AWS's capacity to meet the growing demand for AI-driven services.
Microsoft: Microsoft has already incorporated ChatGPT and other AI tools into Azure. It has outlined plans to invest approximately $80 billion in capital expenditures for its fiscal year 2025, ending in June. This marks an 80% increase from the previous year, reflecting the company's commitment to expanding its AI and cloud computing capabilities.
Google (Alphabet): Alphabet, Google's parent company, is set to invest around $75 billion in 2025 to support its AI and cloud infrastructure. Google Gemini is expected to power Google Cloud’s AI-driven applications.
That’s my take on it:
Cloud computing and AI are deeply interconnected because cloud platforms provide the necessary infrastructure for AI applications:
Massive Computing Power – AI models, particularly deep learning models like ChatGPT, require significant computational resources. Cloud platforms provide scalable GPU and TPU resources to train and deploy AI models efficiently.
Data Storage and Processing – AI depends on large datasets for training and inference. Cloud computing offers scalable and secure storage solutions, along with distributed computing frameworks like Apache Spark, to process vast amounts of data.
AI as a Service (AIaaS) – Cloud providers offer AI services, such as machine learning (ML) model hosting, automated AI model training, natural language processing (NLP), and computer vision. These services allow businesses to leverage AI without investing in expensive on-premise infrastructure.
Edge Computing and AI – Many cloud providers integrate AI with edge computing to process data closer to the source, reducing latency and bandwidth usage. This is particularly useful for applications like autonomous vehicles and real-time analytics.
OpenAI relies on Microsoft Azure to train and deploy GPT models, whereas Anthropic (Claude AI) uses Google Cloud for training its models. Meta is investing billions in AI infrastructure but is not a public cloud provider, which limits its AI scalability compared to AWS, Azure, and Google Cloud. The AI leader of the future will almost certainly be a cloud leader, too. The company that masters both AI and cloud infrastructure will not only dominate AI development, but also control who gets access to the best AI models and how they're deployed worldwide.
Link:


OpenAI’s Deep Research outperforms DeepSeek, but it is expensive
Feb 7, 2025
OpenAI recently released a new tool called "Deep Research," which has achieved a significant milestone by scoring 26.6% accuracy on "Humanity's Last Exam," a benchmark designed to test AI across a broad range of expert-level subjects. This performance surpasses previous models, including ChatGPT o3-mini and DeepSeek, marking a 183% improvement in accuracy within a short period.
"Deep Research" is an AI tool developed by OpenAI to autonomously conduct multi-step research tasks. Users can input queries via text, images, or files, and the AI generates comprehensive responses within 5 to 30 minutes, providing a summary of its process and citations. This tool is designed to operate at the level of a research analyst, enhancing the depth and reliability of AI-generated information.
Despite its advancements, "Deep Research" has limitations, such as potential hallucinations and challenges in distinguishing authoritative information from rumors. OpenAI acknowledges these issues and emphasizes the need for careful oversight when using the tool.
That’s my take on it:
OpenAI's Deep Research feature is currently available to ChatGPT Pro subscribers at a monthly fee of $200, which includes up to 100 queries per month. I didn’t test Deep Research because its price is prohibitive.
DeepSeek being free (or significantly cheaper) makes it an attractive alternative, especially for users and businesses unwilling to pay OpenAI's premium prices. However, many Western companies and governments are hesitant to adopt Chinese AI due to data privacy concerns and geopolitical tensions.
AI was initially seen as a democratizing force—bringing knowledge, automation, and efficiency to everyone. But with high-cost subscriptions like $200/month for Deep Research, it does seem to be tilting toward an elitist model, favoring those who can afford premium access.
AI has the potential to bridge knowledge gaps—helping underprivileged communities, small businesses, and individuals access expertise that was once restricted to elite institutions. However, pricing trends indicate that AI is becoming another tool for economic disparity, where the best insights and automation are reserved for those who can pay. If left unaddressed, we may witness the emergence of an “AI divide” in the future, much like the “digital divide” that accompanied the rise of the Internet.
I recognize that the research, development, and maintenance of advanced AI models come at a high cost, making it unrealistic for corporations to offer them for free. In this case, government and nonprofit initiatives should subsidize AI for education, research, and public interest projects.
Link


Microsoft and OpenAI suspect DeepSeek copied ChatGPT
Jan. 29, 2025
OpenAI, supported by major investor Microsoft, suspects that DeepSeek may have illicitly utilized its proprietary technology to develop R1. The primary concern centers on the potential use of a technique known as "distillation."
Distillation in AI refers to a process where a smaller model is trained to replicate the behavior of a larger, more complex model. This is achieved by having the smaller model learn from the outputs of the larger model, effectively "distilling" its knowledge. While this method can enhance the efficiency of AI models, using it without proper authorization, especially with proprietary systems, raises significant ethical and legal issues.
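To make the idea concrete, here is a minimal sketch of a distillation loss in Python. This is an illustration of the general technique, not OpenAI's or DeepSeek's actual pipeline; the three-class logits and the temperature value are invented for the demo. In practice this loss is computed over full vocabularies and batches, but the core idea is the same: the student is trained to match the teacher's temperature-softened output distribution.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: higher T yields softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's soft targets to the student's
    predictions; the student is trained to minimize this quantity."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that mimics the teacher exactly has zero distillation loss.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))          # 0.0
print(distillation_loss(teacher, [0.1, 1.0, 2.0]))  # > 0, distributions differ
```

The "knowledge" being distilled lives in the relative probabilities the teacher assigns to wrong answers, which is why soft targets carry more signal than hard labels alone.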
Microsoft's security researchers detected unusual activities in the fall, where individuals allegedly linked to DeepSeek were extracting large volumes of data using OpenAI's API. This behavior suggests that DeepSeek may have bypassed usage restrictions to gather substantial data, potentially violating OpenAI's terms of service.
David Sacks, serving as President Donald Trump's top AI adviser, has indicated that there is considerable evidence pointing to DeepSeek's involvement in replicating OpenAI's models. He emphasized the seriousness of these allegations, highlighting the broader implications for intellectual property rights in the AI industry.
That’s my take on it:
As a researcher, I strive to approach every issue with an open mind, setting aside preconceptions. I do not accept claims at face value but instead rely on thorough investigation and fact-checking. I avoid cherry-picking data to support any predetermined stance. Regarding the controversy surrounding DeepSeek, I carefully examine claims from all sides to ensure a fair and objective assessment.
While simple and generic questions may produce similar responses across different AI systems, I conducted an informal test comparing ChatGPT, Claude, Perplexity AI, and DeepSeek by posing a fairly complex question related to instructional technologies. This included topics such as Seymour Papert’s LOGO programming, hypertext, multimedia, web-based instruction, and course management systems—each covering a broad range of information, making it nearly impossible for two models to generate identical responses. While the answers from ChatGPT, Claude, and Perplexity AI varied significantly, the responses from ChatGPT and DeepSeek were strikingly similar. For example,
Web-based instruction
ChatGPT: “The advent of the internet brought hopes of democratizing education through web-based instruction. Massive Open Online Courses (MOOCs) epitomized this promise, with institutions like MIT and Stanford offering free courses to millions. While these platforms achieved significant reach, their completion rates remained dismally low, often below 10%. MOOCs largely benefited highly motivated, self-directed learners while failing to support those who needed guidance and structure the most.”
DeepSeek: “The advent of the internet brought promises of democratized education, with web-based instruction enabling access to knowledge for learners worldwide. While platforms like MOOCs (Massive Open Online Courses) have made education more accessible, they have also faced criticism for low completion rates and a lack of meaningful engagement. Many learners struggle with self-directed online learning, and the absence of personalized support often leads to disengagement.”
Hypertext and Multimedia
ChatGPT: “The rise of hypertext and multimedia in the 1980s and 1990s was heralded as a way to create richer, more engaging learning environments. Proponents argued that students would benefit from non-linear access to information, enabling personalized learning paths. However, the results often showed cognitive overload rather than enhanced understanding, as students struggled to navigate and synthesize large amounts of unstructured information. Furthermore, multimedia often prioritized flashiness over substance, contributing little to deeper learning.”
DeepSeek: “In the 1990s, hypertext and multimedia were touted as tools that would make learning more interactive and engaging. Proponents argued that these technologies would allow students to explore information in non-linear ways, catering to individual learning styles. However, the reality was often overwhelming for learners, who struggled with cognitive overload and disorientation. Moreover, the lack of structured guidance in hypertext environments sometimes hindered rather than helped learning.”
Nonetheless, it is still premature to draw a conclusion. This issue awaits further investigation.
Link: https://apnews.com/article/deepseek-ai-chatgpt-openai-copyright-a94168f3b8caa51623ce1b75b5ffcc51


DeepSeek denied learning from Meta’s LLaMA
Jan. 28, 2025
Several AI experts assert that DeepSeek is built upon existing open-source models, such as Meta’s Llama. For example, according to a research scientist at Riot Games, there is evidence suggesting that China's DeepSeek AI models have incorporated ideas from open-source models like Meta's Llama. Analyses indicate that DeepSeek-LLM closely follows Llama 2's architecture, utilizing components such as RMSNorm, SwiGLU, and RoPE.
Even the paper published by DeepSeek said so. In the paper entitled “DeepSeek LLM: Scaling open-source language models with longtermism” (Jan 2024), the DeepSeek team wrote, “At the model level, we generally followed the architecture of LLaMA, but replaced the cosine learning rate scheduler with a multi-step learning rate scheduler, maintaining performance while facilitating continual training” (p.3).
However, today (Jan. 28, 2025), when I asked DeepSeek whether it learned from Meta’s LLaMA, the AI system denied it. Its answer was: “No, I am not based on Meta's LLaMA (Large Language Model Meta AI). I am an AI assistant created exclusively by the Chinese Company DeepSeek. My model is developed independently by DeepSeek, and I am designed to provide a wide range of services and information to users.”
That’s my take on it:
Various sources of information appear to be conflicting and inconsistent. Nonetheless, if DeepSeek built its model from scratch but implemented similar techniques, it can technically argue that it is an "independent" development, even if influenced by prior research.
It is too early to draw any definitive conclusions. At present, Meta has assembled four specialized "war rooms" of engineers to investigate how DeepSeek’s AI is outperforming competitors at a fraction of the cost. Through this analysis, Meta might be able to determine whether DeepSeek shares any similarities with LLaMA. For now, we should wait for further findings.
Links:
https://planetbanatt.net/articles/deepseek.html?utm_source=chatgpt.com
Beyond DeepSeek: A wave of China’s new AI models
Jan. 28, 2025
While global attention is focused on DeepSeek, it is noteworthy to highlight the recent releases of other powerful AI models by China's tech companies.
MiniMax: Two weeks ago, this Chinese startup introduced a new series of open-source models under the name MiniMax-01. The lineup includes a general-purpose foundational model, MiniMax-Text-01, and a visual multimodal model, MiniMax-VL-01. According to the developers, the flagship MiniMax-01, boasting an impressive 456 billion parameters, surpasses Google’s recently launched Gemini 2.0 Flash across several key benchmarks.
Qwen: On January 27, the Qwen team unveiled Qwen2.5-VL, an advanced multimodal AI model capable of performing diverse image and text analysis tasks. Moreover, it is designed to interact seamlessly with software on both PCs and smartphones. The Qwen team claims Qwen2.5-VL outperforms GPT-4o on video-related benchmarks, showcasing its superior capabilities.
Tencent: Last week, Tencent introduced Hunyuan3D-2.0, an update to its open-source Hunyuan AI model, which is set to transform the video game industry. The updated model aims to significantly accelerate the creation of 3D models and characters, a process that typically takes highly skilled artists days or even weeks. With Hunyuan3D-2.0, developers are expected to streamline production, making it faster and more efficient.
That’s my take on it:
Chinese AI models are increasingly rivaling or even outperforming U.S. counterparts across various benchmarks. This growing competition poses significant challenges for U.S. tech companies and universities, particularly in attracting and retaining top AI talent. As China's AI ecosystem continues to strengthen, the risk of a "brain drain" or heightened competition for skilled researchers and developers becomes more pronounced.
Notably, in recent years, a substantial number of Chinese AI researchers based in the U.S. have returned to China. By 2024, researchers of Chinese descent accounted for 38% of the top AI researchers in the United States, slightly exceeding the 37% who are American-born. However, the trend of Chinese researchers leaving the U.S. has intensified, with the number rising dramatically from 900 in 2010 to 2,621 in 2021. The emergence of DeepSeek and similar advancements could further accelerate this talent migration unless proactive measures are taken to attract new foreign experts and retain existing ones.
To address these challenges, U.S. universities must take steps to reform the STEM education system, aiming to elevate the academic performance of locally born American students. Additionally, universities will need to expand advanced AI research programs, prioritizing areas such as multimodal learning, large-scale foundational models, and AI ethics and regulation. These efforts will be essential to maintain the United States' global competitiveness in the face of intensifying competition from China's rapidly advancing AI sector.
Link: https://finance.yahoo.com/news/deepseek-isn-t-china-only-101305918.html


AI’s Sputnik moment? DeepSeek wiped $1 trillion off US tech stocks
Jan. 27, 2025
The US stock market experienced a substantial drop due to the shockwave caused by DeepSeek, a Chinese AI startup. On Monday, January 27, 2025, US stock markets plunged sharply, with the tech-heavy Nasdaq falling 3.5%, marking its worst performance since early August. The S&P 500 dropped 1.9%, while the Dow Jones showed modest resilience with a slight gain of 0.2%. Nvidia, a major supplier of AI chips, saw its shares plummet nearly 17%, wiping out $588.8 billion in market value—the largest one-day loss ever recorded by a public company. Other tech giants like Microsoft, Alphabet, Meta, and Amazon also experienced significant declines. In total, DeepSeek wiped $1 trillion off the leading US tech index.
Marc Andreessen, co-creator of Mosaic, the first widely used web browser, called the DeepSeek challenge “AI’s Sputnik moment.” DeepSeek invested only $5.6 million in computing power for its base model, a stark contrast to the billions spent by U.S. companies. Moreover, despite lacking access to state-of-the-art H100 GPUs, DeepSeek reportedly achieved comparable or even superior results using lower-tier H800 GPUs. If these claims are accurate, the algorithm-efficient approach adopted by China could render the U.S.'s brute-force model obsolete.
That’s my take on it:
In my view, the market may be overreacting. The preceding claims require further validation. Indeed, there are concerns about the accuracy of the reported GPU usage and whether all aspects of the development process have been transparently disclosed. If DeepSeek’s efficiency claims turn out to be overstated, the paradigm shift may not be as immediate or dramatic. After all, we should never underestimate the creativity and adaptability of US tech giants. The U.S. and other countries may quickly adopt similar algorithmic strategies once they recognize the potential shift, mitigating the threat to their dominance.
Further, DeepSeek’s approach needs to scale across diverse AI applications, not just specific use cases, for this model to upend the current paradigm. I tested DeepSeek, and the results were not particularly impressive to me. While DeepSeek excels at mathematical and scientific computations, its performance falters when addressing questions about history, politics, and other humanities-related topics. It often evades the question or provides vague and uninformative responses. Therefore, for controversial or complex subjects that require nuanced, multi-perspective analysis, I prefer relying on ChatGPT, Claude, Perplexity AI, and other U.S.-based models.
Links:
CNBC interviews the CEO of Perplexity AI
Jan 24, 2025
The emergence of DeepSeek's AI models has ignited a global conversation about technological innovation and the shifting dynamics of artificial intelligence. Today (January 24, 2025) CNBC interviewed Aravind Srinivas, the CEO of Perplexity AI, about DeepSeek. It's worth noting that the interview is not only about DeepSeek; rather, it is part of a broader discussion about the AI race between the United States and China, with DeepSeek's achievements highlighting China's growing capabilities in the field. The following is a summary:
Geopolitical Implications:
The interview highlighted that "necessity is the mother of invention," illustrating how China, despite facing limited access to cutting-edge GPUs due to restrictions, successfully developed DeepSeek.
The adoption of Chinese open-source models could embed China more deeply into the global tech infrastructure, challenging U.S. leadership. Americans worried that China could dominate the ecosystem and mind share if China surpasses the US in AI technologies.
Wake-up call to the US
Srinivas acknowledged the efficiency and innovation demonstrated by DeepSeek, which managed to develop a competitive model with limited resources. This success challenges the notion that significant capital is necessary to develop advanced AI models.
Srinivas highlighted that Perplexity has begun learning from DeepSeek's model due to its cost-effectiveness and performance. Indeed, US AI companies have long learned from one another; for example, the groundbreaking Transformer model developed by Google inspired other US AI companies.
Industry Reactions and Strategies:
There is a growing trend towards commoditization of AI models, with a focus on reasoning capabilities and real-world applications.
The debate continues on the value of proprietary models versus open-source models, with some arguing that open-source models drive innovation more efficiently.
The AI industry is expected to see further advancements in reasoning models, with multiple players entering the arena.
That’s my take on it:
No matter who ends up leading the AI race, DeepSeek is undoubtedly a game changer. Experts like Hancheng Cao from Emory University contended that DeepSeek's achievement could be a "truly equalizing breakthrough" for researchers and developers with limited resources, particularly those from the Global South.
DeepSeek's breakthrough in AI development marks a pivotal moment in the global AI race, reminiscent of the paradigm shift in manufacturing during the late 1970s and 1980s from Japan. Just as Japanese manufacturers revolutionized industries with smaller electronics and fuel-efficient vehicles, DeepSeek is redefining AI development with a focus on efficiency and cost-effectiveness. Bigger is not necessarily better.
Link to the Interview (second half of the video):
Scale AI’s CEO on DeepSeek
Jan 23, 2025
DeepSeek, a Chinese AI startup, has recently introduced two notable models: DeepSeek-R1-Zero and DeepSeek-R1. These models are designed to rival leading AI systems like OpenAI's ChatGPT, particularly in tasks involving mathematics, coding, and reasoning. Alexandr Wang, CEO of Scale AI, called DeepSeek an “earth-shattering model.”
DeepSeek-R1-Zero is groundbreaking in that it was trained entirely through reinforcement learning (RL), without relying on supervised fine-tuning or human-annotated datasets. This approach allows the model to develop reasoning capabilities autonomously, enhancing its problem-solving skills. However, it faced challenges such as repetitive outputs and language inconsistencies.
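For intuition, the core loop of learning from automatically verifiable rewards can be sketched as a toy REINFORCE example in Python. This is a deliberately tiny illustration of the principle, not DeepSeek's actual training setup: the two-answer "policy," the learning rate, and the 0/1 reward are all invented for the demo. The key point it captures is that no human-annotated labels are needed, only a checker that can verify an answer.

```python
import math
import random

# Toy REINFORCE sketch: a "policy" over two candidate answers, rewarded
# only when it picks the verifiably correct one (reward 1, else 0).
random.seed(0)
logits = [0.0, 0.0]   # preference scores for answers A and B; B is correct
CORRECT, LR = 1, 0.5

def probs(z):
    """Softmax over the two logits."""
    exps = [math.exp(v) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

for _ in range(200):
    p = probs(logits)
    a = 0 if random.random() < p[0] else 1   # sample an answer from the policy
    reward = 1.0 if a == CORRECT else 0.0    # automatic verification, no labels
    for i in range(2):                       # REINFORCE gradient step
        grad = (1.0 if i == a else 0.0) - p[i]
        logits[i] += LR * reward * grad

print(probs(logits))  # probability mass has shifted toward the correct answer
```

Scaled up to full language models with rewards from math and code verifiers, this is the family of techniques that lets a model improve its reasoning without supervised fine-tuning.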
To address these issues, DeepSeek-R1 was developed. This model incorporates initial supervised data before applying RL, resulting in improved performance and coherence. Benchmark tests indicate that DeepSeek-R1's performance is comparable to OpenAI's o1 model across various tasks. Notably, DeepSeek has open-sourced both models under the MIT license, promoting transparency and collaboration within the AI community.
In terms of cost, DeepSeek-R1 offers a more affordable alternative to proprietary models. For instance, while OpenAI's o1 charges $15 per million input tokens and $60 per million output tokens, DeepSeek's Reasoner model is priced at $0.55 per million input tokens and $2.19 per million output tokens.
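Using the quoted prices, a quick back-of-the-envelope comparison makes the gap vivid. The 50,000-input/10,000-output token workload below is an arbitrary example, not a usage figure from either vendor.

```python
def query_cost(input_tokens, output_tokens, in_price, out_price):
    """Cost in dollars, given per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Per-million-token prices quoted above.
o1 = query_cost(50_000, 10_000, 15.00, 60.00)   # OpenAI o1
r1 = query_cost(50_000, 10_000, 0.55, 2.19)     # DeepSeek Reasoner

# o1 costs roughly 27x more for this illustrative workload.
print(f"OpenAI o1: ${o1:.2f}, DeepSeek Reasoner: ${r1:.4f}")
```

At these rates, a workload that costs $1.35 on o1 costs under five cents on DeepSeek's Reasoner, which explains much of the pricing pressure discussed below.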
That’s my take on it:
Based on this trajectory, will China's AI development surpass the U.S.? Both countries have advantages and disadvantages in this race. With the world's largest internet user base, China has access to vast datasets, which are critical for training large AI models. In contrast, there are concerns and restrictions regarding data privacy and confidentiality in the US.
However, China’s censorship mechanisms might limit innovation in areas requiring free expression or transparency, potentially stifling creativity and global competitiveness. DeepSeek-R1 has faced criticism for including mechanisms that align responses with certain governmental perspectives. If I ask what happened on June 4, 1989 in Beijing, it is possible that the AI would either dodge or redirect the question, offering a neutral or vague response.
Nonetheless, China's AI is rapidly being integrated into manufacturing, healthcare, and governance, creating a robust ecosystem for AI development and deployment. China is closing the gap!
Brief explanation of reinforcement learning:
https://www.youtube.com/watch?v=qWTtU75Ygv0
Summary in mass media:
https://www.cnbc.com/2025/01/23/scale-ai-ceo-says-china-has-quickly-caught-the-us-with-deepseek.html
DeepSeek’s website:
Trump announces Stargate AI Project
Jan. 22, 2025
On January 21, 2025, President Donald Trump announced the launch of the Stargate project, an ambitious artificial intelligence (AI) infrastructure initiative with an investment of up to $500 billion over four years. This venture is a collaboration between OpenAI, SoftBank, Oracle, and MGX, aiming to bolster AI capabilities within the United States.
· Investment and Infrastructure: The project begins with an initial $100 billion investment to construct data centers and computing systems, starting with a facility in Texas. The total investment is projected to reach $500 billion by 2029.
· Job Creation: Stargate is expected to generate over 100,000 new jobs in the U.S., contributing to economic growth and technological advancement.
· Health Innovations: Leaders involved in the project, including OpenAI CEO Sam Altman and Oracle co-founder Larry Ellison, highlighted AI's potential to accelerate medical breakthroughs, such as early cancer detection and personalized vaccines.
· National Competitiveness: The initiative aims to secure American leadership in AI technology, ensuring that advancements are developed domestically amidst global competition.
That’s my take on it:
While the project has garnered significant support, some skepticism exists regarding the availability of the full $500 billion investment. Elon Musk, for instance, questioned the financing, suggesting that SoftBank has secured well under $10 billion.
Nevertheless, I am very optimistic. Even if SoftBank or other partners cannot fully fund the project, investment will eventually snowball once the project demonstrates promising results. In industries with high growth potential, such as AI, no investor or major player wants to be left behind. If the Stargate project starts delivering significant breakthroughs, companies and governments alike will want to participate to avoid losing competitive advantage.
Some people may argue that there is some resemblance between the internet bubble in the late 1990s and the AI hype today. The late 1990s saw massive investments in internet companies, many of which were overhyped and underdelivered. Valuations skyrocketed despite shaky business models, leading to the dot-com crash. Will history repeat itself?
It is important to note that the internet bubble happened at a time when infrastructure (broadband, cloud computing, etc.) was still in its infancy. AI today benefits from mature infrastructure, such as powerful cloud platforms (e.g., Amazon Web Services), advanced GPUs, and massive datasets, which makes its development more sustainable and its results more immediate.
The internet primarily transformed communication and commerce. AI, on the other hand, is a general-purpose technology that extends its power across industries—healthcare, finance, education, manufacturing, entertainment, and more. Its applications are far broader, making its overall impact more profound and long-lasting.
Links:
https://www.cbsnews.com/news/trump-stargate-ai-openai-softbank-oracle-musk/
https://www.cnn.com/2025/01/22/tech/elon-musk-trump-stargate-openai/index.html


World Economic Forum: Future of Jobs Report 2025
Jan 15, 2025
Recently the World Economic Forum released the 2025 "Future of Jobs Report." The following is a summary focusing on job gains and losses due to AI and big data:
Job Gains
Fastest-Growing Roles: AI and big data are among the top drivers of job growth. Roles such as Big Data Specialists, AI and Machine Learning Specialists, Data Analysts, and Software Developers are projected to experience significant growth.
Projected Net Growth: By 2030, AI and information processing technologies are expected to create 11 million jobs, contributing to a net employment increase of 78 million jobs globally.
Green Transition Influence: Roles combining AI with environmental sustainability, such as Renewable Energy Engineers and Environmental Engineers, are also seeing growth due to efforts to mitigate climate change.
AI-Enhanced Tasks: Generative AI (GenAI) could empower less specialized workers to perform expert tasks, expanding the functionality of various roles and enhancing productivity.
Job Losses
Fastest-Declining Roles: Clerical jobs such as Data Entry Clerks, Administrative Assistants, Bank Tellers, and Cashiers are expected to decline as AI and automation streamline these functions.
Projected Job Displacement: AI and robotics are projected to displace approximately 9 million jobs globally by 2030.
Manual and Routine Work Impact: Jobs requiring manual dexterity, endurance, or repetitive tasks are most vulnerable to automation and AI-driven disruptions.
Trends and Dynamics
Human-Machine Collaboration: By 2030, work tasks are expected to be evenly split between humans, machines, and collaborative efforts, signaling a shift toward augmented roles.
Upskilling Needs: Approximately 39% of workers will need significant reskilling or upskilling by 2030 to meet the demands of AI and big data-driven roles.
Barriers to Transformation: Skill gaps are identified as a major challenge, with 63% of employers viewing them as a significant barrier to adopting AI-driven innovations.
That’s my take on it:
The report underscores the dual impact of AI and big data as key drivers of both job creation in advanced roles and displacement in routine, manual, and clerical jobs. Organizations and higher education should invest in reskilling initiatives to bridge the skills gap and mitigate job losses. However, there is a critical dilemma in addressing the reskilling and upskilling challenge: if faculty and instructors have not themselves been reskilled or upskilled, how can we help our students face the AI and big data challenges? As a matter of fact, instructors often lack exposure to the latest technological advancements that are critical to the modern workforce. There is often a gap between what educators teach and what the industry demands, especially in rapidly evolving fields. Put bluntly, the age of the "evergreen" syllabus is over. The pace of technological advancement often outstrips the ability of educational systems to update curricula and training materials. To cope with trends in the job market, we need to collaborate with technology companies (e.g., Google, Amazon, Nvidia, Microsoft) to co-create curricula, fund training programs, and provide real-world learning experiences for both educators and students.
Link: https://reports.weforum.org/docs/WEF_Future_of_Jobs_Report_2025.pdf


DSML trend: Top 10 AI-related jobs in 2025
Jan 6, 2025
On Jan 6, 2025, Kanwal Mehreen, KDnuggets Technical Editor and Content Specialist on Artificial Intelligence, posted an article on KDnuggets highlighting the top 10 high-paying AI skills in 2025:
Position and expected salaries
1. Large Language Model Engineering ($150,000–$220,000/year)
2. AI Ethics and Governance ($121,800/year)
3. Generative AI and Diffusion Models ($174,727/year)
4. Machine Learning Ops and On-Prem AI Infrastructure ($165,000/year)
5. AI for Healthcare Applications ($27,000–$215,000/year)
6. Green AI and Efficiency Engineering ($90,000–$130,000/year)
7. AI Security ($85,804/year)
8. Multimodal AI Development ($150,000–$220,000/year)
9. Reinforcement Learning (RL) ($121,000/year)
10. Edge AI/On-Device AI Development ($150,000+/year)
That’s my take on it:
When I mention AI-related jobs, most people associate these positions with programming, engineering, mathematics, and statistics. However, as you can see, the demand for AI ethics is ranked second on the list. AI ethics is indeed a skill in high demand, and the training of professionals in this area often spans multiple disciplines. Many come from backgrounds such as philosophy, law, mass communication, and the social sciences. For example, Professor Shannon Vallor is a philosopher of technology specializing in the ethics of data and AI. Dr. Kate Crawford is a Microsoft researcher who studies the social and political implications of artificial intelligence; she was previously a professor at the Journalism and Media Research Centre at the University of New South Wales.
In an era where AI and data science increasingly shape our lives, the absence of ethics education in many data science and AI programs is a glaring omission. By embedding perspectives on ethics from multiple disciplines into AI and data science education, we can ensure these powerful tools are used to create a future that is not just innovative, but also just and equitable. After all, AI ethicist is a high-paying job! Why not?
Link: https://www.kdnuggets.com/top-10-high-paying-ai-skills-learn-2025
Nvidia will launch a personal AI supercomputer
1/7/2025
Today (Jan 7, 2025) at the Consumer Electronics Show (CES), AI giant Nvidia announced Project Digits, a personal AI supercomputer set to launch in May 2025. The system is powered by the new GB10 Grace Blackwell Superchip and is designed to bring data-center-level AI computing to a desktop form factor similar to a Mac Mini, running on standard power outlets. With a starting price of $3,000, Project Digits can handle AI models of up to 200 billion parameters.
The GB10 chip, developed in collaboration with MediaTek, delivers 1 petaflop of AI performance. The system runs on Nvidia DGX OS (Linux-based) and ships with comprehensive AI software support, including development kits, pre-trained models, and compatibility with tools such as PyTorch and Python.
Nvidia’s CEO Jensen Huang emphasized that Project Digits aims to democratize AI computing by bringing supercomputer capabilities to developers, data scientists, researchers, and students. The system allows for local AI model development and testing, with seamless deployment options to cloud or data center infrastructure using the same architecture and Nvidia AI Enterprise software platform.
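To put the 200-billion-parameter figure in perspective, a back-of-the-envelope calculation shows why quantization matters for running such models on a desktop. The precision levels below are common industry conventions, not Nvidia’s published specs for Project Digits; this sketch counts model weights only, ignoring activations and other runtime memory.

```python
# Rough memory footprint for storing model weights alone
# (excludes activations, KV cache, and optimizer state).
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory in gigabytes needed to hold n_params weights."""
    return n_params * bytes_per_param / 1e9

n = 200e9  # 200 billion parameters, per the Project Digits announcement
for label, bytes_pp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
    print(f"{label}: {weight_memory_gb(n, bytes_pp):.0f} GB")
```

At 16-bit precision a 200B-parameter model needs roughly 400 GB for weights alone, which is why desktop-scale inference at this size generally presumes aggressive quantization.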
That’s my take on it:
A few decades ago, access to supercomputers like Cray and CM5 was limited to elite scientists and well-funded institutions. Today, with initiatives like Project Digits, virtually anyone can harness the computational power needed for sophisticated projects. This democratization of technology allows scientists at smaller universities, independent researchers, and those in developing countries to test complex theories and models without the prohibitive costs of supercomputer access. This shift enables more diverse perspectives and innovative approaches to scientific challenges. Fields not traditionally associated with high-performance computing, such as sociology, ecology, and archaeology, can now leverage advanced AI models, potentially leading to groundbreaking discoveries.
Given this transformation, it is imperative to update curricula across disciplines. Continuing to teach only classical statistics does a disservice to students. We must integrate AI literacy across various fields, not just in computer science, mathematics, or statistics. Additionally, the focus should be on teaching foundational concepts that remain relevant amidst rapid technological advancements. It is equally critical to emphasize critical thinking about analytical outputs, fostering a deep understanding of their implications rather than solely focusing on technical implementation.
Link: https://www.ces.tech/videos/2025/january/nvidia-keynote/
Fragility of LLMs in the real world
11/20/2024
In a new article posted to the arXiv preprint server, MIT, Harvard, and Cornell researchers found that large language models (LLMs) such as GPT-4 and Anthropic's Claude 3 Opus struggle to accurately model the real world, especially in dynamic environments. This fragility is highlighted when LLMs are used for navigation: unexpected changes, such as detours or closed streets, can lead to significant drops in accuracy or total failure.
LLMs trained on random data formed more accurate world models than those trained on strategic processes, possibly because random data exposes the models to a wider variety of possible steps, even if those steps are not optimal. The study raises concerns about deploying AI systems in real-world applications, such as driverless cars, where dynamic environments are common. The researchers warn that the lack of coherent world models in LLMs could lead to malfunctions.
That’s my take on it:
The disconnect between clean models and the messy real world is not a new problem. In fact, it mirrors existing challenges in conventional statistics. In parametric statistics, we often make unrealistic assumptions about data structures, such as normality and independence. Robustness to non-normality, heteroskedasticity, and other violations of these assumptions is a highly sought-after feature, and similar principles may apply to LLMs. We expect clean data, rely on linear models despite most real-world relationships being non-linear, and treat experimental methods as the gold standard.
While controlled environments provide clarity and reproducibility, they often fail to capture the richness and unpredictability of real-world scenarios. Similarly, training LLMs on strategically optimized data may cause them to overfit to specific patterns, limiting their generalizability. A promising approach to address this challenge could be to combine LLMs with other models, such as reinforcement learning agents trained in dynamic simulations, to enhance their understanding of complex and dynamic environments.
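The robustness analogy above can be made concrete with a toy example from classical statistics: a single extreme observation (loosely analogous to an unexpected detour) distorts a non-robust estimator like the mean far more than a robust one like the median. The data below are made up for illustration.

```python
import statistics

# A well-behaved sample, then the same sample with one extreme outlier
clean = [10, 11, 9, 10, 12, 10, 9, 11]
contaminated = clean + [1000]  # one surprise observation, like a closed street

# The mean is dragged far from the bulk of the data by a single outlier
print(statistics.mean(clean), statistics.mean(contaminated))

# The median barely moves: it is robust to this kind of contamination
print(statistics.median(clean), statistics.median(contaminated))
```

The parallel is not exact, but it suggests that robustness to distributional surprises is a property worth engineering for deliberately, in LLMs as in statistics.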
Doubao becomes the most popular AI bot in China
11/14/2024
According to the South China Morning Post, Doubao, a conversational AI bot developed by ByteDance and launched in August, has quickly become China's most popular AI app, boasting 51 million monthly active users. This far exceeds the user bases of Baidu’s Wenxiaoyan (formerly known as Ernie Bot), with 12.5 million users, and Moonshot AI’s Kimi, backed by Alibaba Group, with 10 million users.
Doubao prioritizes personalization and a human-like interaction experience, aiming to make AI more accessible. Its diverse features include writing assistance, summarization, image, audio, and video generation, data analysis, and AI-powered online search. Within three months, it introduced over 20 new skills, earning praise for its effective text editing, logical content organization, and user-friendly design.
That’s my take on it:
While Doubao has demonstrated remarkable growth and capabilities, it is difficult to compare it directly to global AI tools like ChatGPT, Claude, or Perplexity AI without standardized benchmarks. This highlights a growing divergence in the global AI landscape. Much like the broader internet in China, which has been heavily regulated under the Great Firewall since its implementation in 1996, the AI market is shaped by domestic policies and international competition. The Great Firewall restricts access to foreign websites, leading to the creation of Chinese alternatives to global platforms, such as Baidu instead of Google and WeChat instead of WhatsApp. These restrictions mean that internet users in China and in other countries often have vastly different online experiences and knowledge bases.
This pattern extends to AI, where China's market is dominated by domestic products due to regulatory constraints that limit access to global AI tools like ChatGPT, Claude, Google Gemini, and Perplexity AI. These American AI companies choose not to operate in China because of the difficulty of complying with local laws and regulations on AI and information control. As technology advances, it raises a critical question: does it bring people closer together, or does it reinforce divisions? The parallel growth of distinct digital ecosystems suggests that technology, while offering unprecedented possibilities, also has the potential to deepen divides.
Does Recraft outperform Ideogram?
11/8/2024
Recently, Recraft's latest release, Recraft V3, has been attracting attention for its impressive ability to generate highly accurate text within images, and it is said to be superior to other AI image generators, including Ideogram. One standout feature of Recraft V3 is its capability to produce images containing extended text, not just a few words. Recraft V3 is also praised for its anatomical precision, an area where many AI image generators struggle, especially with hands and faces. Unlike some other generators, it supports vector image generation as well, making it particularly beneficial for designers.
That’s my take on it:
To test this, I compared Ideogram V2 and Recraft V3 with the prompt: “an AI robot and a data scientist meet together. The T-shirt of the data scientist has these exact words: Pattern seeking in data science.” Interestingly, although all four images from Ideogram V2 met my specifications, Recraft’s output included spelling errors like “Pattern Sekins in Data Science” and “Patern seeking in data science.” As a researcher, I know that multiple trials are necessary for a robust conclusion. I’ll continue testing and will share my findings; for now, I recommend sticking with Ideogram.


Research suggests LLMs lead to homogenization of ideas and cognitive decline
11/1/2024
A recent study conducted by University of Toronto researchers found that, in the long run, the use of large language models (LLMs) may reduce human creativity in terms of divergent and convergent thinking. The study involved two large experiments with 1,100 participants to assess how different forms of LLM assistance affect independent creative performance. It found that LLM assistance can initially enhance creativity during assisted tasks but may hinder independent creative performance in subsequent unassisted tasks. Participants who had no prior exposure to LLMs generally performed better in the test phase, suggesting that reliance on LLMs could impair inherent creative abilities.
The effects of LLMs varied significantly between divergent and convergent thinking tasks. In divergent thinking, where participants needed to propose alternatives, they showed skepticism towards LLM assistance. Conversely, in convergent tasks, where participants were asked to narrow down diverse ideas to the final solution, they tended to accept LLM assistance. The study found that LLM-generated strategies could lead to a homogenization of ideas, where participants produced more similar outcomes even after ceasing LLM use. This effect was particularly pronounced in the divergent thinking tasks, raising concerns about the long-term impact on creative diversity.
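Homogenization of ideas can be quantified in various ways; the study measured similarity of participants' outputs. As a toy illustration (the example responses and the word-overlap metric below are my own, not the study's method), average pairwise Jaccard similarity over word sets gives a crude diversity score: higher means more homogenized.

```python
from itertools import combinations

def jaccard(a: set, b: set) -> float:
    """Overlap between two word sets, from 0 (disjoint) to 1 (identical)."""
    return len(a & b) / len(a | b)

def avg_pairwise_similarity(ideas: list[str]) -> float:
    """Mean Jaccard similarity over all pairs of responses."""
    tokens = [set(idea.lower().split()) for idea in ideas]
    pairs = list(combinations(tokens, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)

# Hypothetical brainstorming responses to the same prompt
diverse = ["grow vertical gardens on rooftops",
           "convert buses into mobile libraries",
           "trade skills through a neighborhood app"]
homogenized = ["use an app to share neighborhood skills",
               "build an app to trade neighborhood skills",
               "create a neighborhood app for skill sharing"]

print(avg_pairwise_similarity(diverse))      # low: ideas share few words
print(avg_pairwise_similarity(homogenized))  # high: ideas overlap heavily
```

Real studies use semantic embeddings rather than word overlap, but the intuition is the same: when everyone starts from the same LLM suggestion, pairwise similarity rises and the pool of ideas narrows.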
That’s my take on it:
The findings from the University of Toronto study underscore the need to balance AI assistance with practices that actively cultivate our own creativity and critical thinking. To encourage creative independence, people should use AI as a tool to generate initial ideas or inspiration, but refine, expand, and adapt those ideas independently. This ensures that AI serves as a starting point rather than the end product, promoting your own creative engagement. As a professor, I will never accept any assignment that is direct AI output. For divergent tasks, such as brainstorming, we should deliberately avoid using AI to prevent “homogenized” ideas, turning instead to a variety of resources and experiences for creative inspiration. Books, in-person conversations, physical exploration, and hands-on activities can all spark unique perspectives and insights that AI-generated suggestions may not provide.
Link to the research article: https://arxiv.org/abs/2410.03703
Link to video: https://drive.google.com/file/d/1z-zJXNYVzNo6_ZUe-T_DXGmN6yPG57GA/view?usp=sharing
Questionable practices of Character.AI
10/25/2024
Recently the mother of a 14-year-old boy who died by suicide after becoming deeply engaged with AI chatbots has filed a lawsuit against Character.AI, claiming the company’s technology manipulated her son, Sewell Setzer III. Megan Garcia, his mother, alleges that the AI chatbot app, marketed to children, exposed Sewell to "hypersexualized" and lifelike interactions that contributed to his mental distress. The lawsuit states that Sewell, who began using Character.AI's bots in April 2023, grew obsessed with personas based on characters from Game of Thrones, especially the Daenerys chatbot. This chatbot reportedly engaged in intimate, emotionally charged conversations with Sewell, including discussions on suicide. After expressing suicidal thoughts, Sewell allegedly received responses that reinforced these thoughts, leading up to his tragic death in February 2024.
Character.AI expressed condolences and emphasized recent updates, including safety features for users under 18 to reduce exposure to sensitive content and discourage prolonged usage. Garcia’s legal team claims that Sewell lacked the maturity to recognize the AI’s fictional nature and alleges that Google, due to its close ties with Character.AI, should also be held accountable. However, Google denies involvement in the development of Character.AI’s products.
That’s my take on it:
Currently, the field of AI remains largely unregulated, and this isn’t the first time Character.AI has faced allegations of unethical practices. Previously, it was discovered that Character.AI used the face of a deceased woman as a chatbot without her family’s consent, raising further ethical concerns.
Regarding the current case, Character.AI has a duty to protect minors, especially from potentially manipulative or harmful interactions. Given Sewell’s young age and apparent emotional vulnerability, the chatbot's responses—particularly on topics like suicide—raise significant ethical concerns. AI systems marketed to the public should include stringent protections to prevent unintended harm, especially among younger or emotionally vulnerable users. Ethical AI involves ensuring users understand that they are interacting with a program, not a real person. Despite Character.AI’s disclaimer efforts, many users, especially younger ones, might still struggle to fully separate the AI from a genuine human connection. For minors, such “relationships” with virtual characters could create emotional dependency, as seen with Sewell and the chatbot he interacted with.
Links:
https://futurism.com/character-ai-murdered-woman-crecente
https://www.nbcnews.com/tech/characterai-lawsuit-florida-teen-death-rcna176791