AI Safety And Regulations Are Two Sides of the Same Coin

I believe the notion of AI regulation [1] [2] is closely tied to AI Safety and Trust as a whole, since it addresses accountability on top of the explainability concepts discussed earlier, and works towards ensuring trustworthiness. Much like the public sector checklist, AI has frequently been compared to the airline industry, a comparison that underscores the importance of converging towards long-term reliable and stable systems built on the fundamentals of accountability, verifiability, and transparency.

Regarding the concept of shared mental models, some argue it is rather abstract, but I see it as a blue ocean of opportunities with various directions worth considering. In particular, there is room to build on existing research that attributes a common mental model to multiple AI agents interacting across different environments [3] [4] and in multi-agent contexts [6]. There is also promise in cognitive analyses of modern AI systems, which reveal their inabilities and limitations in these aspects [7]. Notably, existing research also extends the idea of a shared mental model between agents and humans to encompass all agents and humans, forming a general world model that could potentially serve as a means towards assured AI Safety [8].
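To make this less abstract, here is a minimal sketch (in Python, with names and structure I made up purely for illustration, not taken from the cited works) of what a shared mental model between agents could look like: each agent writes its observations into a common belief store and conditions its decisions on that store rather than on its private state alone.

```python
# A minimal, hypothetical sketch of a "shared mental model": two agents
# observe different parts of an environment and merge their observations
# into one common belief store that both act on. Names and structure are
# illustrative only, not drawn from the cited works.
from dataclasses import dataclass, field


@dataclass
class SharedMentalModel:
    """Common belief state jointly maintained by all agents."""
    beliefs: dict = field(default_factory=dict)

    def update(self, agent_id: str, observation: dict) -> None:
        # Later observations overwrite earlier ones; a real system would
        # need conflict resolution, uncertainty, and provenance tracking.
        for key, value in observation.items():
            self.beliefs[key] = {"value": value, "reported_by": agent_id}

    def query(self, key: str):
        entry = self.beliefs.get(key)
        return entry["value"] if entry else None


class Agent:
    def __init__(self, agent_id: str, model: SharedMentalModel):
        self.agent_id = agent_id
        self.model = model

    def observe(self, observation: dict) -> None:
        self.model.update(self.agent_id, observation)

    def act(self) -> str:
        # Decisions are conditioned on the *shared* model, not private state.
        return "wait" if self.model.query("door_locked") else "enter"


shared = SharedMentalModel()
scout, worker = Agent("scout", shared), Agent("worker", shared)
scout.observe({"door_locked": False})
print(worker.act())  # "enter": the worker acts on the scout's observation
```

The interesting research questions start exactly where this toy stops: how beliefs are reconciled when agents disagree, and how a human's mental model is folded into the same store.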

More noteworthy, there are many research directions and opportunities in value alignment. On one end, there is potential in investigating how social and individual values can be encoded through norm-based systems in a societal context [9]. A broader question is how we can ensure that the metrics we define are truly representative and able to capture the intended values; on that front, much of the literature has argued that existing AI systems fall short, particularly when values are captured through preference-based frameworks or value-as-text systems [10] [11]. Yet, while these critiques have been raised, this remains an open research question with much room to explore, especially towards capturing the more abstract, thicker values at the core of society [10].
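For concreteness, the preference-based frameworks mentioned above typically reduce values to pairwise comparisons: a reward model is trained so that the response humans prefer scores higher than the one they reject, via a Bradley-Terry style objective. The toy sketch below is my own illustration (the scores and pairs are invented, not taken from the cited works), and it hints at the critique: anything annotators cannot express as a binary choice never enters the metric.

```python
# Toy illustration of the pairwise (Bradley-Terry style) objective behind
# many preference-based alignment frameworks. Scores and pairs are invented
# for illustration; a real reward model would produce them from text.
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): low when the preferred answer scores higher."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))


# Hypothetical reward-model scores for two answer pairs.
pairs = [
    {"prompt": "p1", "chosen": 2.1, "rejected": 0.3},   # model agrees with the human label
    {"prompt": "p2", "chosen": -0.5, "rejected": 1.2},  # model disagrees -> large loss
]

for pair in pairs:
    loss = preference_loss(pair["chosen"], pair["rejected"])
    print(f'{pair["prompt"]}: loss = {loss:.3f}')

# Everything the metric "knows" about our values sits in these pairwise labels;
# values that cannot be expressed as a binary choice are invisible to it.
```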

So, where can we go from here?

The following two directions are proposed, focusing on bridging AI alignment concepts with practical implementation and investigating the complex dynamics of Human-AI decision-making.

Direction 1: Advancing Values Alignment

Future research in Values Alignment should focus on two complementary paths to enhance the practical and conceptual maturity of existing work. The first path involves extending current methodologies to yield more concrete, implementable suggestions for integrating ethical and social values into existing AI architectures [11]. The second path seeks to conceptualize and operationalize abstract principles derived from foundational studies [8] [10], transforming vague or high-level concepts into measurable and practical mechanisms that can be integrated into AI systems.
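As a hypothetical illustration of the second path, consider operationalizing an abstract principle, say "be transparent about uncertainty", into a checkable, measurable mechanism. The sketch below is my own toy example; the specific norm, threshold, and function names are assumptions rather than anything prescribed in [8] or [10].

```python
# Hypothetical example of turning an abstract principle into a measurable check:
# the norm "be transparent about uncertainty" becomes a rule that low-confidence
# answers must carry an explicit hedge. Thresholds and phrasing are illustrative.
from dataclasses import dataclass

HEDGE_MARKERS = ("i am not sure", "uncertain", "might", "may", "possibly")


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to come from the model, in [0, 1]


def satisfies_transparency_norm(answer: Answer, threshold: float = 0.6) -> bool:
    """Low-confidence answers must acknowledge their uncertainty explicitly."""
    if answer.confidence >= threshold:
        return True
    return any(marker in answer.text.lower() for marker in HEDGE_MARKERS)


def norm_compliance_rate(answers: list[Answer]) -> float:
    """A simple measurable metric: the fraction of answers respecting the norm."""
    return sum(satisfies_transparency_norm(a) for a in answers) / len(answers)


sample = [
    Answer("The capital of France is Paris.", confidence=0.98),
    Answer("The meeting is possibly on Tuesday.", confidence=0.40),
    Answer("The meeting is on Tuesday.", confidence=0.40),  # violates the norm
]
print(f"compliance rate: {norm_compliance_rate(sample):.2f}")  # 0.67
```

Of course, the hard part is precisely whether such a compliance rate still reflects the value it was meant to capture, which loops back to the measurement question raised above.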

Direction 2: Interdisciplinary Investigation of Human-AI Systems

This direction targets the interdisciplinary domain of Human-AI decision-making by exploring systemic limitations within modern AI. This investigation can proceed through two main avenues: either by employing cognitive analysis techniques to probe the constraints and biases inherent in existing AI systems [7], or by examining the dynamics and overall effectiveness of integrated Human-AI teams and systems [12].



