Good data management of the various aspects of work-integrated learning (WIL) is important. The data concerned may include, for example:
- Contact details of stakeholders, decision-makers, influencers, key role players and members of the relevant community of practice in general
- Established providers of experiential learning opportunities, and the mentors at those providers
- Registered students, including their academic progression
- Actual placements of students, including the provider and mentor details, and the learning agreement of each student
- Capturing of the learning activities of students
- Records of mentoring or supervision, especially if a professional or statutory requirement
- Records of monitoring and progress tracking of students, and of the communication that occurs
- Records of facilitated reflection on learning (experiences) by students
- Formative assessments and summative evaluation of WIL
Generally, data such as those mentioned above are captured in a range of databases and in a variety of forms, for example the institution’s official student administration system and the databases kept by academic and administrative staff responsible for WIL modules. What is needed is integration of these various forms of data, which is where ‘data mesh’ becomes relevant.
Zhamak Dehghani, while a principal technology consultant at Thoughtworks, coined the term ‘data mesh’ in 2019, say Christ, Visengeriyeva and Harrer (n.d.), who indicate that ‘data mesh’ is primarily an organizational approach, which cannot be bought from a vendor. Yuhanna (2022) remarks that “A data mesh offers the ability to optimise mixed workloads by matching processing engines and data flows with the right use cases. It interfaces to the event-driven architecture, enabling support for edge use cases.”
Dehghani (2019) found that many organisations are planning or building their third-generation data and intelligence platform, hoping to democratize data in order to provide business insights with a view to automated, intelligent decisions. The first generation, proprietary enterprise data warehouse and business intelligence platforms, mainly rendered tables and reports which only a small group of specialized people could understand, with little positive impact on the business. The second generation, a complex big data ecosystem with a data lake, required hyper-specialized data engineers and similarly had little impact on the actual business. Unfortunately, the third generation, data platforms that stream and analyse real-time data with cloud-based storage and machine learning platforms, addresses only some of the gaps of the previous generations. Dehghani (2019) recommends shifting from the centralized paradigm of a data lake (or its predecessor, the data warehouse) and embracing a paradigm (data mesh) drawing from contemporary “distributed architecture: considering domains as the first class concern, applying platform thinking to create self-serve data infrastructure, and treating data as a product”.
Dehghani (2020) differentiates operational data from analytical data. Operational data have a transactional nature and serve the applications that run the business, whereas analytical data provide an aggregated view of the facts of the business over time. Analytical data are often modelled to provide retrospective or future-perspective insight. Existing technology, architecture and organization design often reflect the divergence of these two data planes: integrated yet separate, two levels of existence. The divergence causes fragility, a need for continuous extract, transform, and load (ETL) activities, and an ever-growing labyrinth of data pipelines between the two planes, as illustrated in Dehghani (2020, Figure 1: The great divide of data).
The ‘data mesh’ paradigm “recognizes and respects the differences between these two planes: the nature and topology of the data, the differing use cases, individual personas of data consumers, and ultimately their diverse access patterns”, says Dehghani (2020). However, she adds, the ‘data mesh’ paradigm entails an inverted model and topology based on domains. ‘Data mesh’ “offers a more holistic and efficient way to organise, process, and access data”, says Cynozure (2022), and “is an approach to how teams and logistics are organised”. Dehghani (2020), Cynozure (2022) and Christ et al. (n.d.) specify four underpinning principles:
- Domain-oriented decentralized data ownership and architecture: Splitting up the business goals into ideas and concepts within certain boundaries, which smaller autonomous data teams can work on
- Data as a product: The different data domains are productised. Yuhanna (2022) indicates that data as a service (DaaS) is also known as data as a product (DaaP), and that it “offers several business benefits, including supporting a common view of business and customer data using industry standard protocols”.
- Creating self-serve data infrastructure: Dividing the different domains among smaller autonomous teams and building self-serve data platforms for them
- Federated computational governance: Creating unified governance and helping to come up with ways for data to flow from one team to another
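To make the four principles more concrete for WIL, the following is a minimal sketch in Python, not an implementation drawn from any of the cited sources. All names in it (`DataProduct`, `Mesh`, the two domains `student_administration` and `wil_office`, and the sample records) are hypothetical and chosen for illustration only. Each domain owns and publishes its own data product, consumers read products through one self-serve interface, and a single shared rule at publication stands in for federated governance.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataProduct:
    """A data set that one domain owns and offers to others (hypothetical)."""
    name: str
    owner_domain: str                 # the team accountable for the data
    description: str                  # helps other teams discover the product
    read: Callable[[], List[dict]]    # self-serve access point for consumers


class Mesh:
    """A minimal registry connecting the domains' products (hypothetical)."""

    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        # Governance in miniature: the same rule applies to every domain.
        if not product.owner_domain or not product.description:
            raise ValueError("a data product needs an owner and a description")
        self._products[product.name] = product

    def discover(self) -> List[str]:
        return sorted(self._products)

    def read(self, name: str) -> List[dict]:
        return self._products[name].read()


def registered_students() -> List[dict]:
    # Illustrative records owned by the student administration domain.
    return [
        {"student_id": "S001", "programme": "Diploma A"},
        {"student_id": "S002", "programme": "Diploma A"},
    ]


def placements() -> List[dict]:
    # Illustrative records owned by the WIL office domain.
    return [{"student_id": "S001", "provider": "Provider X", "mentor": "Mentor M"}]


def build_mesh() -> Mesh:
    mesh = Mesh()
    mesh.publish(DataProduct("registered_students", "student_administration",
                             "Students registered for WIL modules",
                             registered_students))
    mesh.publish(DataProduct("placements", "wil_office",
                             "Actual placements with provider and mentor",
                             placements))
    return mesh


def unplaced_students(mesh: Mesh) -> List[str]:
    """Combine two domains' products without copying them into one store."""
    placed = {row["student_id"] for row in mesh.read("placements")}
    return [row["student_id"] for row in mesh.read("registered_students")
            if row["student_id"] not in placed]
```

Calling `unplaced_students(build_mesh())` lists the registered students who do not yet have a placement. The point of the design is that neither domain hands its data over to a central store; each keeps ownership and only exposes a readable product.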
Gavish (2022) derived three key concepts pertaining to a ‘data mesh’ to highlight how it differs from traditional data architectures.
- Domain-oriented data owners and pipelines: Domain owners are held accountable for providing their data as products, while also facilitating communication between data distributed across different locations. Domains are further tasked with managing ingestion, cleaning, and aggregation.
- Self-serve functionality: Allowing users to abstract away the technical complexity and focus on their individual data use cases
- Interoperability and standardization of communications: Some data (both raw sources and cleaned, transformed, and served data sets) will be valuable to more than one domain. Cross-domain collaboration is enabled by standardisation of formatting, governance, discoverability, and metadata fields, among other data features. Each data domain further defines and agrees on service-level agreements (SLAs) and quality measures that it will ‘guarantee’ to its consumers.
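The standardised metadata fields and the ‘guaranteed’ quality measures mentioned above could be checked along the following lines. This is a sketch under stated assumptions rather than a method described by Gavish (2022): the required fields in `REQUIRED_METADATA`, the `ServiceLevel` thresholds, and the two measures used (freshness in days and completeness of rows) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List

# Hypothetical set of metadata fields that every domain agrees to supply.
REQUIRED_METADATA = ("owner_domain", "description", "key_field", "last_updated")


@dataclass
class ServiceLevel:
    """Quality measures a domain promises its consumers (hypothetical)."""
    max_age_days: int          # data may not be older than this
    min_completeness: float    # minimum share of rows with no missing values


def missing_metadata(metadata: Dict[str, Any]) -> List[str]:
    """Return the agreed metadata fields that a product fails to supply."""
    return [name for name in REQUIRED_METADATA if name not in metadata]


def completeness(rows: List[dict], fields: List[str]) -> float:
    """Share of rows in which every listed field has a value."""
    if not rows:
        return 0.0
    complete = sum(
        1 for row in rows
        if all(row.get(field) not in (None, "") for field in fields)
    )
    return complete / len(rows)


def meets_service_level(metadata: Dict[str, Any], rows: List[dict],
                        fields: List[str], level: ServiceLevel,
                        today: date) -> bool:
    """Check freshness and completeness against the promised levels."""
    age_days = (today - metadata["last_updated"]).days
    return (age_days <= level.max_age_days
            and completeness(rows, fields) >= level.min_completeness)


# Illustrative mentoring records published by a hypothetical WIL office domain.
mentoring_metadata = {
    "owner_domain": "wil_office",
    "description": "Records of mentoring sessions",
    "key_field": "student_id",
    "last_updated": date(2022, 4, 1),
}
mentoring_rows = [
    {"student_id": "S001", "mentor": "Mentor M"},
    {"student_id": "S002", "mentor": ""},   # mentor not yet recorded
]
```

With these sample records, half of the rows are complete, so a promise of 90% completeness would not be met even though the data are recent. Publishing such checks alongside the data lets a consuming domain judge for itself whether a product is fit for its purpose.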
It appears that a ‘mesh’ connecting the various data sources pertaining to the WIL of qualifications is necessary.
Christ, J.; Visengeriyeva, L. & Harrer, S. (n.d.) Data mesh architecture from an engineering perspective. Electronically accessible from https://t.co/HCzrguXmrg
Cynozure. (2022). Data mesh as a framework for data-driven value at scale. Hub & Spoken, Episode 127, 7 April 2022. Electronically accessible from https://t.co/x8qnxRPyt1
Dehghani, Z. (2019). How to move beyond a monolithic data lake to a distributed data mesh. Martin Fowler articles. Electronically accessible from https://martinfowler.com/articles/data-monolith-to-mesh.html
Dehghani, Z. (2020). Data mesh principles and logical architecture. Martin Fowler articles. Electronically accessible from https://martinfowler.com/articles/data-mesh-principles.html
Gavish, L. (2022) What is a data mesh — and is it right for me? HackerNoon, 15 April 2022. Electronically accessible from https://t.co/oMiqO99kBg
Yuhanna, N. (2022). Four emerging data integration trends to assess: To accelerate their performance in data integration, companies are evaluating and adopting a range of contributing technologies. Computer Weekly, 8-14 February 2022, 20-23.