The WatchTower: 18th Edition
Welcome to the captivating world of Artificial Intelligence!
Welcome to the 18th edition of the WatchTower! In this edition, we explore an innovative network architecture that addresses some fundamental limitations of current AI systems and discuss the potential of AI to revolutionise education.
Featured in This Edition:
Intro to Computer Vision Workshop
Kolmogorov-Arnold Networks (KANs) - Better than Multilayer Perceptrons (MLPs)?
The Impact of AI on EdTech: Shaping the Future of Learning
Upcoming Events
Join our upcoming introductory workshop on computer vision! We'll explore the theory behind today's advanced applications and equip you with the practical skills to build and utilise computer vision models on your own.
Date: Wednesday, 26 June 2024
Kolmogorov-Arnold Networks (KANs) - Better than Multilayer Perceptrons (MLPs)?
Nearly all of the AI models we encounter today are based on multilayer perceptrons (MLPs), which, despite their impressive capabilities, have some fundamental flaws that limit their accuracy and feasibility for widespread adoption in many industries. A recent paper by researchers at MIT, Caltech, and Northeastern University introduces an alternative type of network, the Kolmogorov-Arnold Network (KAN), that proposes some very promising solutions to these limitations.
The Limitations of MLPs
The Curse of Dimensionality
MLPs owe their success in part to the Universal Approximation Theorem, which states that a feedforward neural network with a single hidden layer containing a finite number of neurons can approximate any continuous function. However, MLPs suffer from what is known as the curse of dimensionality. This means that the more features you add to your model, the sparser your data becomes, reducing the accuracy of your model. Imagine a 2D graph where data points are relatively close together - if you fit a curve to these data points, the curve will be a good approximation of the relationship. However, if you then spread the data out over a third dimension without adding any new data points, the data points become further apart, causing the curve to stretch over long distances between data points. This problem only gets worse as you add more dimensions. To address the curse of dimensionality, current techniques often involve reducing the number of variables in your model by removing or combining features, but this approach is fragile.
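The sparsity effect described above is easy to see numerically. The sketch below (a self-contained illustration, not from the paper) draws the same number of random points in 2 and in 20 dimensions and compares how far each point is from its nearest neighbour:

```python
import math
import random

def avg_nearest_neighbour(n_points: int, dim: int, seed: int = 0) -> float:
    """Average distance from each random point in the unit hypercube
    to its nearest neighbour."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    total = 0.0
    for i, p in enumerate(pts):
        # Nearest neighbour by brute force; fine for a small demo.
        total += min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
    return total / n_points

# Same number of samples, but in 20 dimensions every point is far
# from all of its neighbours - the data has become sparse.
print(avg_nearest_neighbour(200, dim=2))
print(avg_nearest_neighbour(200, dim=20))
```

With 200 points, the average nearest-neighbour distance jumps from a few hundredths in 2D to well over 1 in 20D, even though the amount of data is unchanged.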
Interpretability
Another major limitation of MLPs is their lack of interpretability. MLPs represent functions in a non-symbolic way; they don't explicitly define the relationships between variables, but rather train a large number of weights to implicitly capture these relationships by repeatedly computing weighted sums and feeding results through activation functions. As a result, we can't extract the exact equations that were executed for a given input to achieve an observed output. This severely limits their adoption for many purposes - how can we rely on outputs that we can't fully explain?
Parameters and Energy Consumption
The best MLP models currently consist of an enormous number of parameters. Generally speaking, the more parameters a model has, the more computational resources are required for training and inference, and this computation translates to higher energy consumption. For instance, ChatGPT has been reported to use half a million kilowatt-hours daily to handle all user requests; at roughly 29 kilowatt-hours per day for an average U.S. household, that is the daily usage of around 17,000 homes. To make these models more sustainable, we'll need to scale down the parameter count.
KANs as a Solution
KANs offer promising solutions to each of the above problems.
The Curse of Dimensionality
KANs utilise the Kolmogorov-Arnold Representation Theorem, which states that any multivariate continuous function can be represented as a finite sum of continuous single-variable functions.
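Written out, the theorem says that any continuous function of n variables can be built from single-variable functions and addition alone:

```latex
f(x_1, \dots, x_n) = \sum_{q=1}^{2n+1} \Phi_q \left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

Here each \phi_{q,p} and \Phi_q is a continuous function of one variable; in a KAN, these are exactly the functions that get learned.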
This means that we don't need to model functions in high-dimensional vector space but can convert them into single-variable functions (think of a 2D plot). While the theorem dates back to 1956, and shallow two-layer KANs have been tried before (resulting functions were often non-smooth or even fractal), the key advancement of this paper is its use of this theorem in deep neural networks. By adding layers to the network, we can obtain a highly accurate model, generating smooth functions while avoiding the curse of dimensionality.
Interpretability
KANs model exact symbolic formulas which can be extracted and interpreted. If we obtain an output from some inputs, we can inspect all of the equations that were executed in the process. KANs facilitate this by placing learnable non-linear activation functions on each edge of the network that can be trained using piecewise polynomial curves known as B-splines.
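To make the "learnable activation on an edge" idea concrete, here is a minimal sketch (an illustration, not the paper's implementation) of a single KAN edge function phi(x) = sum_i c_i * B_i(x), built from B-spline basis functions via the Cox-de Boor recursion; the coefficients c_i are what training would adjust:

```python
def bspline_basis(i: int, k: int, t: list[float], x: float) -> float:
    """Cox-de Boor recursion: the i-th B-spline basis function of
    degree k over knot vector t, evaluated at x."""
    if k == 0:
        return 1.0 if t[i] <= x < t[i + 1] else 0.0
    left = right = 0.0
    if t[i + k] != t[i]:
        left = (x - t[i]) / (t[i + k] - t[i]) * bspline_basis(i, k - 1, t, x)
    if t[i + k + 1] != t[i + 1]:
        right = (t[i + k + 1] - x) / (t[i + k + 1] - t[i + 1]) * bspline_basis(i + 1, k - 1, t, x)
    return left + right

def edge_activation(x: float, coeffs: list[float], t: list[float], k: int = 2) -> float:
    """A single KAN edge: a weighted sum of B-spline bases. The
    coefficients play the role of the edge's trainable parameters."""
    return sum(c * bspline_basis(i, k, t, x) for i, c in enumerate(coeffs))

# Clamped knots on [0, 4] giving 6 degree-2 basis functions.
knots = [0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0]
coeffs = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0]  # hypothetical learned values
y = edge_activation(1.5, coeffs, knots)
```

Because each edge is a piecewise polynomial with known coefficients, the function it computes can be read off, plotted, or matched against a symbolic formula - which is where the interpretability comes from.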
Parameters and Energy Consumption
KANs have demonstrated remarkable performance on toy datasets, significantly outperforming MLPs while using far fewer parameters. This result challenges the common belief that scaling models is necessary to improve performance and may lead to the development of powerful models that consume far less energy.
Limitations of KANs
KANs are not without their own limitations, and many people remain skeptical of their potential. A key current limitation is their slow training speed: KANs aren't GPU-efficient, so they currently train much more slowly than MLPs. However, the authors note that this may be an engineering problem rather than a fundamental shortcoming of KANs, and that future training tools and methods may resolve it.
The scalability of KANs is also yet to be determined. Currently, KANs have only been tested on small toy datasets, so it remains to be seen if KANs can perform as well as MLPs when working on larger and more complex problems.
Looking Ahead
While both their scalability and practical feasibility are still uncertain, the potential of KANs appears promising. If they can reliably solve in practice the problems they address in theory, their impacts may prove to be quite profound. Ziming Liu, the first author of the paper, also raises another deeper, slightly philosophical question that KANs may help address. That is, are the relationships we are approximating with neural networks symbolic in nature? Can fundamental principles and relationships in complex systems like vision and language be explained with elegant, simple, interpretable formulas? If so, KANs may help us find them.
Published by Jonas Macken, 24 June 2024
The Impact of AI on EdTech: Shaping the Future of Learning
Source: UTS
Artificial Intelligence (AI) is transforming educational technology (EdTech) and has the potential to bridge the educational gap in Australia. The report "Shaping AI and EdTech to Tackle Australia's Learning Divide" by Leslie Loble and Aurora Hawcroft explores how AI can enhance learning, especially for disadvantaged students.
The Seriousness of the Educational Chasm and How It Can Be Fixed
Australia faces a significant educational divide. Disadvantaged students in Year 3 are already two years and five months behind their advantaged peers, and this gap widens to over five years by Year 9. The COVID-19 pandemic has exacerbated these disparities, especially for students lacking adequate learning support (CIRES & Mitchell Institute 2020; Goss & Sonneman 2020).
AI-driven EdTech offers a promising solution to close this gap by providing personalised, adaptive learning experiences that can cater to individual student needs and help teachers identify and support at-risk students more effectively.
Different Forms of Teacher-Oriented and Student-Oriented Technologies
AI in EdTech can be broadly categorised into teacher-oriented and student-oriented technologies:
Teacher-Oriented Technologies:
Smart Teaching Support Tools: These tools offer custom lesson plans, assessments, and teaching materials. They help teachers save time and improve lesson quality by providing resources aligned with curriculum standards.
Adaptive Assessment Systems: These systems adjust the difficulty of lessons based on student performance, as well as offering detailed insights into student learning needs and progress.
Diagnostic Tools: Early detection of learning difficulties like dyslexia can be achieved through AI.
Student-Oriented Technologies:
Intelligent Tutoring Systems: These systems provide personalised learning paths and immediate feedback, adapting to each student's level of understanding and pace of learning.
Adaptive Learning Platforms: These platforms use AI to tailor educational content to individual student needs, helping them master subjects at their own pace.
Engagement Tools: AI can enhance student engagement by making learning interactive and enjoyable, thereby improving motivation and outcomes.
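At their simplest, the "adaptive" systems above rest on a feedback loop between performance and difficulty. The function below is a hypothetical staircase rule, a sketch for illustration only - it is not taken from the report or from any named product:

```python
def update_difficulty(level: int, correct: bool, lo: int = 1, hi: int = 10) -> int:
    """One step of a simple staircase rule: move up a level after a
    correct answer, down after an incorrect one, clamped to [lo, hi]."""
    level += 1 if correct else -1
    return max(lo, min(hi, level))

# A short simulated session: difficulty tracks the student's answers.
level = 5
for answered_correctly in [True, True, False, True]:
    level = update_difficulty(level, answered_correctly)
```

Real adaptive platforms use far richer models of student knowledge, but the principle - keep each learner near the edge of their ability - is the same.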
Evidence of Effectiveness
The report highlights a growing body of evidence showing that high-quality EdTech, when used effectively, can improve educational outcomes for disadvantaged students. For instance, AI-powered tutoring systems and adaptive learning platforms have demonstrated significant improvements in student achievement (CESE 2015b, pp. 2, 5). Real-world applications have shown that these technologies can provide personalised support, enhance student engagement, and offer valuable insights to teachers, leading to better-targeted interventions (Cheung & Slavin 2012).
Best Practices for Using AI in Education
The report recommends several best practices to ensure AI in EdTech is effective and equitable:
Quality and Governance: Implement strong regulatory frameworks to ensure AI tools are safe, effective, and used ethically.
Teacher Training: Provide professional development to help teachers integrate AI tools into their teaching practices effectively.
Targeted Support: Use AI to identify and support at-risk students, ensuring they receive the personalised help they need.
Collaboration and Feedback: Involve educators in the design and evaluation of AI tools to ensure they meet classroom needs and support teacher-led instruction.
Conclusion
AI has the potential to revolutionise education by making learning more personalised, accessible, and effective. By implementing AI-driven EdTech tools thoughtfully and with strong governance, Australia can bridge the educational divide and improve learning outcomes for all students, especially those who are disadvantaged.
Published by Lucy Lu, 24 June 2024
Sponsors
Our ambitious projects would not be possible without the support of our GOLD sponsor, UNOVA.
Closing Notes
We welcome any feedback or suggestions for future editions; email us at [email protected].
Stay curious,
Sources
Kolmogorov-Arnold Networks (KANs)
Liu, Z et al. 2024, "KAN: Kolmogorov-Arnold Networks", arXiv:2404.19756.
The Impact of AI on EdTech: Shaping the Future of Learning
CIRES & Mitchell Institute 2020, Impact of learning from home on educational outcomes for disadvantaged children, Victoria University
Cheung, A & Slavin, R 2012, "How Features of Educational Technology Applications Affect Student Reading Outcomes: A Meta-analysis", Educational Research Review, vol. 7, no. 3, pp. 198-215.
Centre for Education Statistics and Evaluation (CESE) 2015b, The effectiveness of tutoring interventions in mathematics for disadvantaged students, NSW Department of Education
Loble, L & Hawcroft, A 2022, Shaping AI and EdTech to Tackle Australia's Learning Divide, doi:10.57956/kxye-qd93.