By enabling engineers to reproduce physical processes in detail, computer simulation is transforming how industrial products are designed, analyzed, and manufactured. Despite its remarkable success, one persistent question bothers both analysts and decision-makers:
how good are these simulations exactly?
Uncertainty quantification, which stands at the confluence of probability, statistics, computational mathematics, and disciplinary sciences, provides a promising framework to answer that question and has gathered tremendous momentum in recent years. In this article, we will discuss the following aspects of uncertainty quantification:
Making reliable model-based predictions is not always an easy task.
In generic form, we have a model f(·) with some model parameters θ, intended to simulate some real-world process. Given an input x, we can then use the model to make a prediction, which leads us to the output y = f(x; θ).
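To make this concrete, here is a minimal sketch in Python. The linear model and the parameter values are purely illustrative stand-ins for whatever f(·) and θ happen to be in a real application:

```python
# A prediction y = f(x; theta), with a linear model as the simplest
# concrete instance (the function form and parameter values are
# purely illustrative)
def f(x, theta):
    intercept, slope = theta
    return intercept + slope * x

theta = (1.0, 2.0)  # calibrated model parameters (hypothetical)
y = f(3.0, theta)   # prediction for input x = 3.0
```

In practice, f(·) might be anything from a regression model to a full physics simulator, and θ would be estimated from data or calibrated against experiments.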
In regression analysis, labeling the training samples is usually time-consuming and consumes a large portion of the computational budget.
In our engineering team, we face this kind of challenge all the time.
Our task is to design aero-engine components. We constantly need to train regression models to predict product performance given the design parameters. For us, labeling a training sample means running high-fidelity physics simulations, which can easily take days, or even weeks, on a cluster.
Obviously, if the model needs many samples to reach satisfactory accuracy, the resulting computational burden would be a nightmare.
Luckily, we found a simple solution that dramatically cuts the number of training samples while maintaining prediction accuracy: active learning. …
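To give a flavor of the idea, here is a minimal pool-based active-learning sketch using query-by-committee. Everything is hypothetical: the "expensive simulation" is a toy function, and the committee is just polynomial fits on bootstrap resamples; a real setup would use a proper surrogate model and a real simulator:

```python
import numpy as np

rng = np.random.default_rng(42)

def expensive_simulation(x):
    # Stand-in for a high-fidelity physics simulation (hypothetical)
    return np.sin(3 * x) + 0.5 * x

# Pool of unlabeled candidate designs
pool = np.linspace(0, 3, 300)

# Start with a handful of labeled samples
X = np.linspace(0, 3, 5)
y = expensive_simulation(X)

for _ in range(10):
    # Query-by-committee: fit several models on bootstrap resamples
    committee = [
        np.polyfit(X[idx], y[idx], deg=4)
        for idx in (rng.integers(0, len(X), len(X)) for _ in range(20))
    ]
    preds = np.array([np.polyval(c, pool) for c in committee])
    # Spend an expensive label only where the committee disagrees most
    x_new = pool[np.argmax(preds.std(axis=0))]
    X = np.append(X, x_new)
    y = np.append(y, expensive_simulation(x_new))
```

The key point is the query step: instead of labeling samples at random, each iteration spends the simulation budget on the single design about which the current model is most uncertain.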
Animations are great data visualization tools to convey complicated insights engagingly. In this article, we will walk through the steps of creating an animation with
This is what we will make: it simulates various projectile motion trajectories and updates the associated histogram of the projectile shooting range.
This tutorial goes as follows. After introducing the animation packages and the data we use, we dive straight into creating the animation. We go from simple to complex: to start, we animate a single trajectory, followed by animating multiple trajectories. …
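Before any animation code, each frame needs trajectory data to draw. A minimal sketch of generating one ideal (drag-free) projectile trajectory is below; the launch speed and angle are arbitrary example values, and the resulting arrays would be fed frame by frame to an animation routine such as Matplotlib's `FuncAnimation`:

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def trajectory(v0, angle_deg, n_frames=50):
    """Sample points along an ideal (drag-free) projectile path."""
    theta = np.radians(angle_deg)
    t_flight = 2 * v0 * np.sin(theta) / G   # time until landing
    t = np.linspace(0, t_flight, n_frames)
    x = v0 * np.cos(theta) * t
    y = v0 * np.sin(theta) * t - 0.5 * G * t**2
    return x, y

# One trajectory: launch speed 30 m/s at 45 degrees (example values)
x, y = trajectory(30, 45)
shooting_range = x[-1]  # equals v0**2 * sin(2*theta) / G
```

Calling `trajectory` with different speeds and angles produces the multiple trajectories (and the shooting ranges for the histogram) that the animation cycles through.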
For data scientists, effectively communicating the uncertainty of our analysis to stakeholders is crucial for reliable decision-making.
I’ve been working on uncertainty quantification analysis for some time. From the numerous presentations I have given to various audiences, I have learned that although the usual uncertainty visualization techniques such as box plots, violin plots, confidence bands, etc., are compact and precise in displaying uncertainty, they may only resonate with trained statisticians.
For broader audiences, including stakeholders, domain experts, etc., …
Want to deliver faster optimization for time-consuming objective functions? Consider surrogate optimization.
In engineering, a product optimization analysis involves finding the combination of design parameters that globally maximizes or minimizes an objective function, e.g., product performance or manufacturing cost, across the design parameter space.
The optimization routine requires many iterations to locate the global optimum. Within each iteration, high-fidelity, time-consuming computer simulations are usually conducted to evaluate the current design's objective function. Consequently, the whole optimization process could easily take up to days or even weeks if the objective function is complex.
In our engineering team, we have implemented an optimization strategy called surrogate optimization to combat this issue. The results were quite impressive: we could easily achieve around 50-fold increases in optimization speed! …
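The core loop of surrogate optimization can be sketched in a few lines. This is a deliberately toy version, assuming a 1D design space, a quadratic polynomial as the surrogate, and a made-up "expensive" objective; production setups typically use Kriging/Gaussian-process surrogates with infill criteria such as expected improvement:

```python
import numpy as np

def expensive_objective(x):
    # Stand-in for a time-consuming simulation run (hypothetical)
    return (x - 2.0) ** 2

# A few initial designs spread over the design space [0, 5]
X = np.array([0.0, 2.5, 5.0])
y = expensive_objective(X)

for _ in range(5):
    # Fit a cheap quadratic surrogate to all samples collected so far
    a, b, c = np.polyfit(X, y, deg=2)
    # Optimize the surrogate analytically (vertex of the parabola),
    # clipped to the design space
    x_next = float(np.clip(-b / (2 * a), 0.0, 5.0))
    # Spend an expensive evaluation only at the surrogate's optimum
    X = np.append(X, x_next)
    y = np.append(y, expensive_objective(x_next))

x_best = X[np.argmin(y)]
```

The speed-up comes from the division of labor: the many optimization iterations run against the cheap surrogate, while the expensive simulation is consulted only a handful of times to refine it.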
In this post, we will review some of the most useful probability concepts applied to multiple random variables. In particular, I will show you how those concepts are logically connected.
An intuition for each concept is built before the math is discussed. To that end, I’ve created many illustrations to explain abstract concepts visually. I’ve also included links to related topics this post doesn’t cover. A linkable table of contents is provided below, in case you want to jump to specific topics directly. I hope these efforts offer you a better reading experience.
If you want to refresh your knowledge of concepts like random variables, probability density functions, expectation, or variance, feel free to check out my previous…
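As a small taste of the multiple-random-variable concepts ahead, covariance and correlation can be checked numerically. The sketch below uses synthetic data constructed so that the true correlation is 0.8 (the coefficients 0.8 and 0.6 are chosen to make both variances equal to 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated random variables: Y = 0.8*X + independent noise,
# constructed so Var(X) = Var(Y) = 1 and Corr(X, Y) = 0.8
n = 10_000
x = rng.normal(size=n)
y = 0.8 * x + 0.6 * rng.normal(size=n)

cov = np.cov(x, y)        # 2x2 sample covariance matrix
corr = np.corrcoef(x, y)  # sample correlation matrix
```

With enough samples, the off-diagonal entries of both matrices land close to the theoretical value of 0.8.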
In this post, we will review some of the most useful probability concepts and important tools for exploratory data analysis. The table of contents is given in the mind map above. Throughout the post, I have created many illustrations to explain abstract concepts visually. Meanwhile, I’ve used a dataset to show how the probability concepts are applied in practice. I hope these efforts offer you a better reading experience.
In part I of this series, we’ve introduced the fundamental concepts of surrogate modeling. In part II, we’ve seen surrogate modeling in action through a case study that presented the full analysis pipeline.
To recap, the surrogate modeling technique trains a cheap yet accurate statistical model to serve as a surrogate for the computationally expensive simulations, thus significantly improving the efficiency of product design and analysis.
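In its simplest form, the recipe is: run a small budget of expensive simulations, fit a cheap model to them, and validate it on fresh runs. A minimal sketch follows; the "simulation" is a hypothetical toy response, and the polynomial fit stands in for the Kriging or radial-basis-function surrogates typically used in practice:

```python
import numpy as np

def simulation(x):
    # Stand-in for an expensive physics simulation (hypothetical response)
    return np.exp(-x) * np.sin(4 * x)

# A small budget of expensive training runs
X_train = np.linspace(0, 2, 12)
y_train = simulation(X_train)

# Cheap surrogate: a polynomial regression fit
coeffs = np.polyfit(X_train, y_train, deg=8)

def surrogate(x):
    return np.polyval(coeffs, x)

# Validate the surrogate against fresh simulation runs
X_test = np.linspace(0.05, 1.95, 50)
max_err = np.max(np.abs(surrogate(X_test) - simulation(X_test)))
```

Once validated, the surrogate replaces the simulator in downstream tasks (optimization, sensitivity analysis, uncertainty propagation), where it can be evaluated millions of times at negligible cost.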
In part III, we will briefly discuss the following three trends that have emerged in surrogate modeling research and application:
In part I of this series, we introduced the idea of using surrogate models to accelerate simulation-based product design processes. This is achieved by training a statistical model to serve as a cheap yet accurate surrogate for the simulations in various design tasks, thereby significantly improving the analysis efficiency.
In part II, we will go through a case study to demonstrate how to use surrogate models in practice. The roadmap for this case study is shown below: