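Materials science is chiefly concerned with the relationships between the structure and properties of materials, relationships that span scales from the atomic to the micrometer. Scanning transmission electron microscopy (STEM) has become an important tool for studying materials at these scales, particularly because it can be combined with advanced data analysis techniques, which opens new opportunities for automated experiments and multidimensional data processing.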

Fig. 1 Example of the errors introduced via StyleGAN [116].
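With the development of machine learning algorithms, STEM shows broad promise for real-time analysis and automated control. A team led by Prof. Sergei V. Kalinin of the Department of Materials Science and Engineering, University of Tennessee, USA, and Dr. Debangshu Mukherjee of the Computational Sciences and Engineering Division, Oak Ridge National Laboratory, has reviewed machine learning for automated experimentation in scanning transmission electron microscopy.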

Fig. 2 Automated task-based statistical analysis via few-shot learning.
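Scanning transmission electron microscopy and its spectroscopic techniques have become cornerstone tools in modern materials science, condensed matter physics, chemistry, and biology. The technique's impact is directly tied to the amount of quantifiable information on material structure and properties that it can provide. Breakthroughs in fields such as cryogenic electron microscopy (Cryo EM) and small-crystal electron diffraction show that data analysis methods and efficient operational workflows greatly increase the value gained from technique development and point to the field's enormous growth potential.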
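In STEM, automated experimentation is a rapidly rising trend. The field is currently in transition from human-driven to automated experiments and faces numerous challenges. On the instrument side, high-level hyper-languages must be developed so that human actions can be described with the most basic operational units. On the machine learning side, this requires supervised learning algorithms that are robust to out-of-distribution drift effects, as well as active learning techniques that can be trained on small amounts of data.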
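
The idea of a hyper-language built from minimal primitives can be illustrated with a small sketch. Everything here is an assumption made for illustration: the primitive names (`MoveStage`, `SetDefocus`, `Acquire`), the `SimulatedMicroscope` class, and the `grid_survey` helper are hypothetical and do not correspond to any vendor API or to the authors' own language. The sketch only shows how one human-level action ("survey a grid") might expand into a flat sequence of basic operations that an instrument can interpret one at a time.

```python
from dataclasses import dataclass, field
from typing import List, Union


# Hypothetical primitives: illustrative names only, not any vendor's API.
@dataclass(frozen=True)
class MoveStage:
    dx_nm: float  # relative stage shift along x, in nanometers
    dy_nm: float  # relative stage shift along y, in nanometers


@dataclass(frozen=True)
class SetDefocus:
    value_nm: float  # absolute defocus setting, in nanometers


@dataclass(frozen=True)
class Acquire:
    dwell_us: float  # per-pixel dwell time, in microseconds


Primitive = Union[MoveStage, SetDefocus, Acquire]


@dataclass
class SimulatedMicroscope:
    """Toy instrument state that interprets primitives one at a time."""
    x_nm: float = 0.0
    y_nm: float = 0.0
    defocus_nm: float = 0.0
    frames: List[tuple] = field(default_factory=list)

    def execute(self, op: Primitive) -> None:
        if isinstance(op, MoveStage):
            self.x_nm += op.dx_nm
            self.y_nm += op.dy_nm
        elif isinstance(op, SetDefocus):
            self.defocus_nm = op.value_nm
        elif isinstance(op, Acquire):
            # Record where and how the frame was taken instead of real data.
            self.frames.append((self.x_nm, self.y_nm, self.defocus_nm, op.dwell_us))
        else:
            raise TypeError(f"unknown primitive: {op!r}")


def grid_survey(nx: int, ny: int, step_nm: float, dwell_us: float) -> List[Primitive]:
    """Expand the human-level action 'survey an nx-by-ny grid' into primitives."""
    program: List[Primitive] = []
    for iy in range(ny):
        for ix in range(nx):
            program.append(Acquire(dwell_us))
            if ix < nx - 1:
                program.append(MoveStage(step_nm, 0.0))
        if iy < ny - 1:
            # Return to the start of the row and step down one row.
            program.append(MoveStage(-(nx - 1) * step_nm, step_nm))
    return program


# Usage: set the focus once, then run a 3-by-2 survey on the simulated instrument.
scope = SimulatedMicroscope()
program = [SetDefocus(-50.0)] + grid_survey(nx=3, ny=2, step_nm=10.0, dwell_us=5.0)
for op in program:
    scope.execute(op)
```

Because the program is just a list of immutable records, the same sequence could in principle be logged, shared, or replayed on a different back end, which is the portability argument made for hyper-languages.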

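On the computing and network side, this requires edge computing infrastructure that not only supports rapid analysis and decision making but also connects instruments to the global cloud. The latter will in turn promote efficient sharing of data and code, the formation of distributed human-machine teams, and the emergence of cross-instrument collaboration networks.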

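Even so, the shift to automated experiments also requires the scientific community to fundamentally change how it plans and carries out experimental work. To date, all known automated experiments in microscopy use workflows based on fixed policies and predefined objects of interest. The only examples that go beyond conventional human workflows are inverse discovery experiments based on deep kernel learning.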

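Truly unlocking the potential of automated experiments hinges on clearly defining the experimental reward, that is, an explicit experimental goal, which may be exploratory discovery, hypothesis testing, or quantitative measurement. Many such rewards can be defined only within the broader scientific context of a specific domain application. Next, deterministic or probabilistic policies must be formulated, that is, algorithms that connect specific actions expressed in the hyper-language to the current state of the system (image or spectrum). These policies can be set before the experiment to balance exploration and exploitation or, more strikingly, they can evolve as the experiment proceeds so as to reach the stated reward within the given experimental budget.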
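
A policy that trades off exploration and exploitation within a fixed experimental budget can be sketched with a deliberately simple stand-in: the classic UCB1 multi-armed bandit rule. This is not the authors' method (the review points to Bayesian optimization, reinforcement learning, and deep kernel learning); it is only a minimal example of how the next action is chosen from what has been observed so far. The names `ucb1_experiment`, `simulated_measure`, and `TRUE_SIGNAL`, and the notion of four candidate "regions" with a scalar reward, are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List


def ucb1_experiment(
    measure: Callable[[int], float],
    n_regions: int,
    budget: int,
    c: float = 2.0,
) -> List[int]:
    """Choose which region to measure at each step using the UCB1 rule.

    `measure(region)` returns a scalar reward; `budget` is the total number
    of measurements allowed. Returns the sequence of regions visited.
    """
    counts = [0] * n_regions   # measurements taken per region
    means = [0.0] * n_regions  # running mean reward per region
    history: List[int] = []
    for t in range(1, budget + 1):
        if t <= n_regions:
            # Visit every region once before applying the rule.
            region = t - 1
        else:
            # Exploitation term (mean) plus exploration bonus (uncertainty).
            region = max(
                range(n_regions),
                key=lambda i: means[i] + math.sqrt(c * math.log(t) / counts[i]),
            )
        reward = measure(region)
        counts[region] += 1
        means[region] += (reward - means[region]) / counts[region]
        history.append(region)
    return history


# Hypothetical ground truth: region 2 carries the strongest signal.
TRUE_SIGNAL = [0.2, 0.5, 0.9, 0.4]
rng = random.Random(0)  # seeded so the run is repeatable


def simulated_measure(region: int) -> float:
    """Stand-in for a real measurement: true signal plus Gaussian noise."""
    return TRUE_SIGNAL[region] + rng.gauss(0.0, 0.05)


history = ucb1_experiment(simulated_measure, n_regions=4, budget=200)
most_visited = max(range(4), key=history.count)
```

In this toy setting the policy spends most of its budget on the most rewarding region while still revisiting the others occasionally. A policy that "evolves along the experiment" in the sense of the text would also update the reward model or the action set, which this sketch does not attempt.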

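In summary, automated experimentation (AE) in STEM is still in its infancy but is changing rapidly. Given the rapid development of Python-based APIs, cloud infrastructure, and remotely controlled microscopes, and especially the recent advances in active learning methods such as Bayesian optimization, reinforcement learning, and other forms of stochastic optimization, the field can be expected to grow quickly over the next few years.
The article was recently published in npj Computational Materials 9: 227 (2023).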

Editorial Summary
Materials science focuses on the study of the relationships between the structure and properties of materials that span from the atomic to the micrometer scale. Scanning transmission electron microscopy (STEM) has become an important tool for studying materials at these scales, especially due to its ability to be combined with advanced data analysis techniques, which provide new opportunities for automated experiments and multidimensional data processing. With the development of machine learning algorithms, STEM has promising applications in real-time analysis and automation.
A team led by Prof. Sergei V. Kalinin from the Department of Materials Science and Engineering, University of Tennessee, and Dr. Debangshu Mukherjee from the Computational Sciences and Engineering Division, Oak Ridge National Laboratory, USA, reviewed machine learning for automated experimentation in scanning transmission electron microscopy. Scanning transmission electron microscopy and spectroscopy have become foundational tools in modern materials science, condensed matter physics, chemistry, and biology. The impact of this technique is directly related to the amount of quantifiable information on materials structure and properties it can derive. The success of fields such as Cryo EM and small crystal electron crystallography suggests that the availability of data analysis methods and operational workflows greatly amplifies the value derived from technique developments and points to tremendous potential for growth of the field. One of the rapidly emerging trends in STEM is the development of automated experiments.
Here, the authors give an overview of some of the challenges that the transition from human-driven to automated experiments in EM will bring. On the instrument side, this necessitates the development of instrument-level hyper-languages that allow human operations to be represented via minimal primitives. On the ML side, it requires the development of supervised ML algorithms that are stable with respect to out-of-distribution drift effects, and of active learning methods that can be trained on small volumes of data. On the computational and network side, it requires the development of edge computing infrastructure capable of supporting rapid analysis and decision making and of connecting the instrument to the global cloud. The latter in turn opens the pathway to effective data and code sharing, the formation of distributed human-ML teams, and the emergence of lateral instrumental networks. However, the transition to automated experiments also requires deep changes in the way the scientific community plans and executes experimental activities. To date, all examples of automated experiments in microscopy that the authors are aware of were performed with workflows based on fixed policies and a priori known objects of interest. The only examples of beyond-human workflows are the inverse discovery experiments based on deep kernel learning. Going beyond simple imitation of human operation and unleashing the power of automated experiments requires clearly defining the experimental reward, i.e., the specific goals. These can include discovery (curiosity learning), hypothesis falsification, or quantitative measurements. Many of these rewards are defined only within the broader scientific context of specific domain applications. Secondly, this requires formulating deterministic or probabilistic policies, i.e., algorithms connecting the specific action expressed in the hyper-language to the observed state of the system (image or spectra).
These policies can be defined prior to the experiment to balance the exploration and exploitation goals. Alternatively, and much more interestingly, the policies can evolve along the experiment to achieve the desired reward within the given experimental budget.
Overall, the current state of AE in STEM is nascent but fast changing. Given the rapid emergence of Python-based APIs, cloud infrastructure, and remotely controlled microscopes, and especially given recent advances in active learning methods including Bayesian optimization, reinforcement learning, and other forms of stochastic optimization, this field is likely to grow quickly in the coming years.
This review article was recently published in npj Computational Materials 9: 227 (2023).
Original Abstract and Translation
Machine learning for automated experimentation in scanning transmission electron microscopy
Sergei V. Kalinin, Debangshu Mukherjee, Kevin Roccapriore, Benjamin J. Blaiszik, Ayana Ghosh, Maxim A. Ziatdinov, Anees Al-Najjar, Christina Doty, Sarah Akers, Nageswara S. Rao, Joshua C. Agar & Steven R. Spurgeon
Abstract: Machine learning (ML) has become critical for post-acquisition data analysis in (scanning) transmission electron microscopy, (S)TEM, imaging and spectroscopy. An emerging trend is the transition to real-time analysis and closed-loop microscope operation. The effective use of ML in electron microscopy now requires the development of strategies for microscopy-centric experiment workflow design and optimization. Here, we discuss the associated challenges with the transition to active ML, including sequential data analysis and out-of-distribution drift effects, the requirements for edge operation, local and cloud data storage, and theory in the loop operations. Specifically, we discuss the relative contributions of human scientists and ML agents in the ideation, orchestration, and execution of experimental workflows, as well as the need to develop universal hyper languages that can apply across multiple platforms. These considerations will collectively inform the operationalization of ML in next-generation experimentation.
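Abstract (translation): Machine learning (ML) has become a key technique for post-acquisition processing of (scanning) transmission electron microscopy, (S)TEM, imaging and spectroscopy data. A current emerging trend is the transition to real-time analysis and closed-loop microscope operation. Using machine learning effectively in electron microscopy now requires developing microscopy-centric strategies for experimental workflow design and optimization. Here, we discuss the challenges of the transition to active machine learning, including sequential data analysis, out-of-distribution drift effects, the requirements of edge computing, local and cloud data storage, and theory-in-the-loop operation. In particular, we discuss the relative contributions of human scientists and machine learning agents to the ideation, orchestration, and execution of experimental workflows, and the need to develop universal hyper-languages that can be applied across multiple platforms. Together, these considerations will shape how machine learning is operationalized in next-generation experimentation.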
Original article by 計算搬磚工程師. If you repost it, please credit the source, 華算科技, and cite the original location: http://www.zzhhcy.com/index.php/2024/01/07/fff2cc93b1/