Fuyang's Blog

On the way to achieve minimalism and essentialism

Category: Engineering

Data Science and Machine Learning Essentials – Module 1 – Classification, Regression, Clustering and Recommendation

Classification and Regression

  • Classification and regression use data with known values to train a machine learning model so that it can identify unknown values for other data entities with similar attributes.
  • Classification is used to identify Boolean (True/False) values. Regression is used to identify real numeric values. So a question like “Is this a chair?” is a classification problem, while “How much does this person earn?” is a regression problem.
  • Both classification and regression are examples of supervised learning, in which a machine learning model is trained using a set of existing, known data values. The basic principle is as follows:
  1. Define data entities based on a collection (or vector) of numeric variables (which we call features) and a single predictable value (which we call a label). In classification, the label has a value of -1 for False and +1 for True.
  2. Assemble a training set of data that contains many entities with known feature and label values – we call the set of feature values x and the label value y.
  3. Use the training set and a classification or regression algorithm to train a machine learning model to determine a function (which we call f) that operates on x to produce y.
  4. The trained model can now use the function f(x)=y to calculate the label (y) for new entities based on their feature values (x). In classification, if y is positive, the label is True; if y is negative, the label is False. The function can be visualized as a line on a chart, showing the predicted y value for a given x value. The predicted values should be close to the actual known values for the training set, as shown in figure 1 below.
  5. You can evaluate the model by applying it to a set of test data with known label (y) values. The accuracy of the model can be validated by comparing the value calculated by f(x) with the known value for y, as shown in figure 2.

Figure 1: A trained model


Figure 2: A validated model
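
The five steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data points are invented, and a one-feature least-squares line stands in for whatever algorithm a real tool would use. The function names (fit_line, predict, classify) are hypothetical.

```python
# Minimal sketch of steps 1-5: fit f(x) = a*x + b on a training set,
# then check it against a held-out test set. All data values are made up.

def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def predict(model, x):
    """Regression reading of the model: the numeric value f(x)."""
    a, b = model
    return a * x + b

def classify(model, x):
    """Classification reading of the model: positive f(x) means True."""
    return predict(model, x) > 0

# Step 2: training set with known feature (x) and label (y) values.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [-3.0, -1.0, 1.0, 3.0]

# Step 3: train the model, i.e. determine f.
model = fit_line(train_x, train_y)

# Step 5: evaluate on test data whose labels are known but were not used for training.
test_x = [0.0, 5.0]
test_y = [-5.0, 5.0]
errors = [abs(predict(model, x) - y) for x, y in zip(test_x, test_y)]
print("model:", model, "test errors:", errors)
```

Small test errors suggest the function generalizes; large ones, despite a good fit on the training set, would point to over-fitting.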

  • In a supervised learning model, the goal is to produce a function that accurately calculates the known label values for the training set, but which is also generalized enough to predict accurately for known values in a test set (and for unknown values in production data). Functions that only work accurately with the training data are referred to as “over-fitted”, and functions that are too general and don’t match the training data closely enough are referred to as “under-fitted”. In general, the best functions are ones that are complex enough to accurately reflect the overall trend of the training data, but which are simple enough to calculate accurate values for unknown labels.

Figure 3: An over-fitted model


Figure 4: An under-fitted model


  • Clustering is an unsupervised learning technique in which machine learning is used to group (or cluster) data entities based on similar features.
  • It is difficult to evaluate clustering models, because there is no training set with known values that you can use to train the model and compare its results.
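
As an illustration of grouping entities by similar features, here is a minimal sketch of k-means on one-dimensional data. The notes above do not name a specific algorithm, so k-means is an assumption chosen for brevity; the data points and starting centers are invented.

```python
# Minimal 1-D k-means sketch. k-means is one common clustering algorithm,
# used here as an assumed example; data and starting centers are made up.

def kmeans_1d(points, centers, iterations=10):
    clusters = []
    for _ in range(iterations):
        # Assign each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (unchanged if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 10.0])
print(centers, clusters)
```

Note that nothing in the code compares the result with a known answer, which is exactly why such models are hard to evaluate.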


  • Recommender systems are machine learning solutions that match individuals to items based on the preferences of other similar individuals, or other similar items that the individual is already known to like.
  • Recommendation is one of the most commonly used forms of machine learning.

Berkeley Electronic Interfaces Course – Capacitor revisited

We have previously talked about using a capacitor in our amplifier module. However, we didn’t mention the frequency behavior of that circuit. As we know, the output voltage of an RC circuit is frequency dependent. To make the analysis easier, we first need to talk about phasors, or phase vectors.

What is Phasor?

For now we just need to remember that phasors are introduced to simplify calculation. A phasor is a complex number representing a sinusoidal function whose amplitude (A), angular frequency (ω), and initial phase (θ) are time-invariant. It comes about in the following way. As we all know:

v(t) = R \cdot i(t)

where v and i are the time-dependent voltage and current of some circuit with resistance R. Now let’s replace R with Z, which is called impedance; later you will see why that is convenient. For now we simply assume Z is something like R (for resistors only, Z = R).

v(t) = Z \cdot i(t)

As we know, the time-dependent v and i can be written as a cos(ωt + θ) function (in fact, a sum of an infinite number of cosine functions can be used to build up any time-dependent waveform you may have). Together with Euler’s formula, we get something that looks like this:

\Re \lbrace \mathbb{V} e^{j \omega t} \rbrace = \Re \lbrace Z \cdot \mathbb{I} e^{j \omega t} \rbrace

where the phasor \mathbb{V} = V e^{j \theta} combines the amplitude and the initial phase, and \Re means taking the real part of the complex value inside the braces. Dropping \Re together with the time-dependent part e^{j \omega t}, we get:

\mathbb{V} = Z \cdot \mathbb{I}


So in the phasor world we can do simple calculations with phasors just as we used to with voltage, current and resistance. In general the impedance is a complex number, Z = R + jX, where R is the resistance and X is the reactance, which describes the energy storage characteristic of the system.
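
A quick numerical check can make the phasor idea concrete. The sketch below builds a phasor with Python's cmath module and confirms that the real part of the phasor times e^{jωt} reproduces the cosine. The amplitude, phase, frequency and time values are arbitrary assumptions.

```python
import cmath
import math

A = 2.0                      # amplitude (arbitrary example value)
theta = math.pi / 3          # initial phase (arbitrary example value)
omega = 2 * math.pi * 50.0   # angular frequency for 50 Hz (arbitrary example value)

# The phasor carries only amplitude and initial phase: V = A * e^{j*theta}
V = cmath.rect(A, theta)

def v_time(t):
    """Time signal recovered from the phasor: Re{ V * e^{j*omega*t} }."""
    return (V * cmath.exp(1j * omega * t)).real

t = 0.003
direct = A * math.cos(omega * t + theta)
print(v_time(t), direct)
```

Both printed numbers agree to floating-point precision, which is why the time-dependent factor can be dropped while doing the circuit algebra and restored at the end.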

So we can now conclude that for a resistor, a capacitor and an inductor, Z is:

Z_R = R

Z_C = {{1} \over {j \omega C}}

Z_L = j \omega L

So now we have a set of tools to describe things mathematically, since the following rules apply:

Components in series: Z_{eq} = Z_1 + Z_2 + Z_3 + ...

Components in parallel: {1 \over Z_{eq}} = {1 \over Z_{1}} + {1 \over Z_{2}} + {1 \over Z_{3}} + ...
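
Since Python has complex numbers built in, these rules translate almost directly into code. The following is a minimal sketch; the helper names and the component values are assumptions chosen for illustration.

```python
# Impedances as Python complex numbers. Helper names are hypothetical.

def z_resistor(r_ohm):
    return complex(r_ohm, 0.0)

def z_capacitor(c_farad, omega):
    return 1 / (1j * omega * c_farad)

def z_inductor(l_henry, omega):
    return 1j * omega * l_henry

def series(*zs):
    """Z_eq = Z_1 + Z_2 + ..."""
    return sum(zs)

def parallel(*zs):
    """1/Z_eq = 1/Z_1 + 1/Z_2 + ..."""
    return 1 / sum(1 / z for z in zs)

R, C = 2.7e3, 1e-6        # example values: 2.7k Ohm and 1 uF
omega = 1 / (R * C)       # at this frequency |Z_C| equals R

z_rc = series(z_resistor(R), z_capacitor(C, omega))
z_rr = parallel(z_resistor(100.0), z_resistor(100.0))
print(z_rc, z_rr)
```

At the chosen frequency the capacitor contributes -j times 2.7k Ohm, so the series impedance has equal real and imaginary magnitudes, and two equal resistors in parallel give half the resistance, as expected.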

Example – RC circuit – Low Pass Filter


Now let’s see an example of how to use the math above. The voltage V_C can be calculated as if the capacitor were a resistor in the same position, that is, with the voltage divider rule. (Note: from here on, capital letters denote complex numbers, or phasors.)

V_C = {Z_C \over {Z_C+Z_R}} V_S

V_C = {1 \over {1 + j \omega RC}} V_S


V_C = H \cdot V_S; H={1 \over {1 + j \omega RC}}


Notice that |H| is the magnitude of H, and it changes with frequency. When the frequency is zero, |H| = 1 and the full source voltage appears on V_C; as the frequency goes higher and higher, the output on V_C gets lower and lower. This is called a low pass filter. We define the cutoff frequency of the filter as the frequency at which the output power drops by half, or equivalently at which the magnitude of H drops to about 0.707 of its value at zero frequency. One can show that, setting

|H| = {1 \over \sqrt{2}} |H(0)|

one gets the cutoff frequency as:

\omega = {1 \over {RC}}
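
The calculation is short enough to write out. It uses only the expression for H given above; the symbol f_c for the cutoff frequency in Hz is introduced here for convenience.

```latex
|H(\omega)| = \left\vert {1 \over {1 + j \omega RC}} \right\vert = {1 \over \sqrt{1 + (\omega RC)^2}}, \qquad |H(0)| = 1

{1 \over \sqrt{1 + (\omega RC)^2}} = {1 \over \sqrt{2}} \;\Rightarrow\; (\omega RC)^2 = 1 \;\Rightarrow\; \omega = {1 \over {RC}}

f_c = {\omega \over {2 \pi}} = {1 \over {2 \pi RC}}
```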

Example – RC circuit – High Pass Filter

Similarly, for the high pass filter, we have:

H = {j \omega RC \over j \omega RC + 1}

And the cutoff frequency as:

|H| = {1 \over \sqrt{2}} |H(\infty)|

\omega = {1 \over {RC}}
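
To see both filters behave as described, the two transfer functions can be evaluated at a few frequencies around the cutoff. This is a small sketch with example component values (2.7k Ohm and 1 uF are assumptions for illustration).

```python
import math

R = 2.7e3   # ohms, example value
C = 1e-6    # farads, example value

def h_low(omega):
    """Low pass: H = 1 / (1 + j*omega*R*C)."""
    return 1 / (1 + 1j * omega * R * C)

def h_high(omega):
    """High pass: H = j*omega*R*C / (j*omega*R*C + 1)."""
    return (1j * omega * R * C) / (1j * omega * R * C + 1)

omega_c = 1 / (R * C)   # cutoff angular frequency in rad/s

for factor in (0.01, 0.1, 1.0, 10.0, 100.0):
    omega = factor * omega_c
    print(f"{factor:>6} x omega_c: low {abs(h_low(omega)):.3f}, high {abs(h_high(omega)):.3f}")
```

At exactly omega_c both magnitudes are about 0.707; well below it the low pass output is close to 1 and the high pass output close to 0, and well above it the roles swap.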


Experimenting with real circuit

So now let’s try to build some RC circuits of our own, measure how the magnitude of H changes with frequency, and see whether the cutoff frequencies agree with our equations above.

We have two setup cases here, with different resistors but the same capacitor. The parameters are as below:

Case    R           C       Cutoff freq
1       2.7k Ohm    1 uF    58.95 Hz
2       300 Ohm     1 uF    530.52 Hz
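
The cutoff frequencies in the table follow from ω = 1/(RC) converted to Hz, that is f = 1/(2πRC). A short sketch to reproduce them (the function name is hypothetical):

```python
import math

def cutoff_hz(r_ohm, c_farad):
    """Cutoff frequency in Hz of a first-order RC filter: f = 1 / (2*pi*R*C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

case1 = cutoff_hz(2.7e3, 1e-6)   # 2.7k Ohm, 1 uF
case2 = cutoff_hz(300.0, 1e-6)   # 300 Ohm, 1 uF
print(f"case 1: {case1:.2f} Hz, case 2: {case2:.2f} Hz")
```

The same function works for both the low pass and the high pass arrangement, since they share the same cutoff.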

Figures: low pass experiment and Bode plot, case 1 (2.7k Ohm, 1 uF, fc = 59 Hz)

Figures: high pass experiment and Bode plot, case 1 (2.7k Ohm, 1 uF, fc = 59 Hz)

Figures: low pass experiment and Bode plot, case 2 (300 Ohm, 1 uF, fc = 531 Hz)

Figures: high pass experiment and Bode plot, case 2 (300 Ohm, 1 uF, fc = 531 Hz)

Voilà, the measured results fit the theoretical calculation very well. I am happy 🙂

Berkeley Electronic Interfaces Course – Robot Module 4 – Controlling the motor (Inductors, Transistors and Diode)


Time to make our robot able to jump and move around.

PN2222A npn transistor, used in robot motor module

This chapter of the course has many in-depth discussions of inductors, transistors, MOSFETs, BJTs, diodes and so on, and it is interesting to understand things at a deeper level. However, to keep things simple for beginners like me, the key knowledge from this session can be summarized as:

  • How to switch things on and off with a small input signal
  • Why a diode is needed for an inductive load (for example, a motor)
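
The second point comes down to the inductor relation v = L · di/dt: when the transistor switches off, the current through the motor winding is forced to change very quickly, so the voltage across it becomes very large. A rough sketch of the numbers follows; the inductance, current and switching time are hypothetical values, not measurements from this robot.

```python
def inductor_voltage(l_henry, delta_i_amp, delta_t_s):
    """Average voltage across an inductor while its current changes: v = L * di/dt."""
    return l_henry * delta_i_amp / delta_t_s

# Hypothetical winding: 1 mH carrying 0.2 A that is switched off within 1 microsecond.
spike = inductor_voltage(1e-3, -0.2, 1e-6)
print(f"voltage across the winding during switch-off: {spike:.0f} V")
```

With these assumed numbers the spike is about 200 V in magnitude, far above the 9 V battery. A diode placed across the motor gives the current a path to keep flowing and die away gradually, so the voltage never builds up like this.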

Instead of providing the course notes and videos, I found these two short (and funny) videos on YouTube that show all the info you need to assemble this module for the robot:

First: Transistor / MOSFET tutorial

Second: Inductive spiking, and how to fix it

Berkeley Electronic Interfaces Course – Robot Module 3.2 – Amplifier

The Golden Rules

Basically, by using the golden rules shown above, one can approximately analyze an amplifier circuit and calculate the output gain of a specific amplifier setup.

Microphone Front End


Then let’s use the circuit designed above to make a microphone front end. Together with an on-board ADC, we can verify the microphone readings via an on-board LED. So the signal flow looks like this.


Bill of Materials:

  • resistors: 2.7k, 10k (x3), 100k
  • electret microphone
  • ceramic disk capacitor (1 microfarad)
  • OPA2344 (or equivalent dual op amp chip)


Parts I changed in my case, because the microphone I bought is not sensitive enough:

  • Rs changed from 2.7k Ohm to 300 Ohm
  • Microphone is powered by 3.3V instead of 9V (it seems to make no difference for me, but I changed it anyway hoping to save some power)

Single-Supply Circuit 

Since we are working with a single battery voltage source (the op amp rails are connected to 3.3 V and ground, rather than +3.3 V and -3.3 V), we have to carefully consider our reference. The voltage divider at the op amp’s non-inverting input sets the circuit’s reference to 1.65 V.

In this way, an AC input that swings both negative and positive produces an output that oscillates around 1.65 V, up to 3.3 V and down to 0 V. This is called a single-supply circuit.

For our example, let’s set Rf = 100 kOhm, Rs = 2.7 kOhm and Vcc = 3.3 V. Using golden rule number 2 (the two op amp inputs sit at the same voltage), we know the DC voltage at both the inverting and non-inverting terminals of the op amp is Vcc/2. Also note that, because the capacitor blocks DC, the DC voltage at the junction of the capacitor and Rs (which is also the input end of the amplifier) is Vcc/2 as well.

By using golden rule number 1 (no current flows into the op amp inputs) and applying KCL at the inverting terminal of the op amp, we get:

i_{R_s} = i_{R_f}

which means the current through Rs equals the current through Rf. By Ohm’s law we get:

{{V_{cc} / 2 + V_{in} - V_{cc} / 2} \over R_s }={ {V_{cc} / 2 - V_{out}} \over R_f }

V_{out} = {V_{cc} / 2 - {R_f \over R_s} V_{in}}

so in our case:

V_{out} = {1.65 - 37 V_{in}}
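
To get a feel for the numbers, the output equation can be evaluated directly. The sketch below compares the course value Rs = 2.7 kOhm with the 300 Ohm resistor I swapped in; the 10 mV input is a hypothetical microphone signal, and the function name is made up.

```python
def amp_output(v_in, r_f, r_s, v_cc):
    """Ideal output of the single-supply inverting stage: V_cc/2 - (R_f/R_s) * V_in."""
    return v_cc / 2.0 - (r_f / r_s) * v_in

R_F = 100e3
V_CC = 3.3

gain_course = R_F / 2.7e3      # about 37
gain_modified = R_F / 300.0    # about 333

v_out = amp_output(0.010, R_F, 2.7e3, V_CC)   # hypothetical 10 mV input
print(f"gain {gain_course:.1f} vs {gain_modified:.1f}, V_out = {v_out:.3f} V")
```

Lowering Rs raises the gain roughly ninefold. The formula is ideal, though: the real output cannot leave the 0 V to 3.3 V range set by the supply rails, so large inputs will clip.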

So that’s it. With the code loaded onto the MSP430G2 controller, we can use the green LED to show whether the microphone has heard something or not.

Course info:

“EE40LX teaches the fundamentals of engineering electronic interfaces between the physical world and digital devices. Students can expect to cover the material of a traditional first circuits course with a project-based approach. We start with essential theory and develop an understanding of the building blocks of electronics as we analyze, design, and build different parts of a robot from scratch around a microcontroller. This course uses the Texas Instruments MSP430G2 LaunchPad, but you are welcome to bring whichever development board or microcontroller you like!”

// EE40LX
// Sketch 3.6
// Description; the MSP430 reads the output of the microphone circuit at P1.5 and
// decides whether or not to flash on an LED based on the sound level
// Tom Zajdel
// University of California, Berkeley
// July 27, 2015
// Version 1.1 July 27, 2015 - Added curly brackets to conditional statements
// - No longer declaring value in global scope

int MICINP = A5; // set MICINP as P1.5 alias
int GRNLED = P1_6; // set GRNLED as P1.6 alias

void setup()
{
  // start the serial monitor (9600 baud assumed)
  Serial.begin(9600);
  // set GRNLED as output pin
  pinMode(GRNLED, OUTPUT);
  Serial.println("Setup complete!");
}

void loop()
{
  int value; // declare variable value to store result of analogRead
  value = analogRead(MICINP); // get the voltage from the microphone
  Serial.println(value); // write digitized value to serial monitor

  // the course uses a threshold of 560; here I changed it to 515 to make it more sensitive
  if (value >= 515)
  {
    digitalWrite(GRNLED, HIGH); // turn on the LED...
  }
  else
  {
    digitalWrite(GRNLED, LOW); // ...else turn off the LED
  }
  delay(1); // delay in between reads for stability
}

In order to use analog input pins, we use the “AX” as an alias. All pins in Port 1 may be used as analog input pins (see http://energia.nu/Guide_MSP430LaunchPad.html for more information).

To read the voltage at this pin in the main loop, we use the analogRead() function, which converts the voltage to a 10-bit integer. We also print this value to the Serial monitor and use it to determine whether or not to turn on the green LED.

The value 560 corresponds to a voltage of 3.3 V \cdot {560 \over 1023} = 1.806 V. That is, the LED turns on whenever this voltage (the output of the microphone amplifier) exceeds 1.806 V.
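
The same conversion shows what lowering the threshold to 515 does. This is a small sketch assuming the 3.3 V reference and 10-bit range described above; the function name is made up.

```python
V_REF = 3.3      # ADC reference voltage in volts
ADC_MAX = 1023   # largest value of a 10-bit conversion

def counts_to_volts(counts):
    """Voltage represented by a 10-bit analogRead() result."""
    return V_REF * counts / ADC_MAX

mid_rail = V_REF / 2.0   # 1.65 V, the resting output of the amplifier
for threshold in (560, 515):
    volts = counts_to_volts(threshold)
    print(f"{threshold}: {volts:.3f} V, {volts - mid_rail:.3f} V above the 1.65 V reference")
```

With 560 the amplifier output has to rise about 0.16 V above its 1.65 V resting level before the LED turns on; with 515 it only needs about 0.01 V, which is why the modified sketch reacts to much quieter sounds (and is more easily triggered by noise).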



Berkeley Electronic Interfaces Course – Robot Module 3.1 – Comparator

Power Blocking

Instead of powering the Wheatstone bridge and its sensor all the time, one should power it from an output pin of the MSP430G2, so that the bridge is powered only for short periods. It’s like taking measurements intermittently, which saves a lot of power.

The way to do it is shown below: use P1.1 on the MSP430G2 as the power output and connect it to the front rail, where all sensors are connected.


Now the Wheatstone bridge has been power blocked 🙂


Then we can attach the Wheatstone bridge sensor to the OPA2344 amplifier. Here the amp works as a comparator: the small voltage difference between the two Wheatstone bridge output pins is compared, and the comparator outputs either +3.3V or 0V.

As shown below:

  • First, connect the amplifier’s power supply on the breadboard;
  • Then connect the Vout of the Wheatstone bridge to the input pins of the OPA2344;
  • Then connect the output of the amp to pin P1.2 or P1.7 to switch, in this example, the on-board red or green LED on and off.



There we go. Now if we boot the chip up, it is easy to test: when a light is held in front of the photocell, the on-board LED lights up, and vice versa.



// EE40LX
// Sketch 3.2
// Description; Power-block a 3.3V rail at P1.1 and subsequently read inputs from
// Wheatstone bridges, connected to P1.2 and P1.7
// Tom Zajdel
// University of California, Berkeley
// July 27, 2015
// Version 1.2 July 27, 2015 - Added curly brackets to conditional statements
// Version 1.1 January 26, 2015 - Fixed a timing bug by using delayMicroseconds()
// and also corrected errors in pin assignment
// see pins_energia.h for more LED definitions

int PBRAIL = P1_1; // set PBRAIL as P1.1 alias
int LPHOTO = P1_2; // set LPHOTO as P1.2 alias
int RPHOTO = P1_7; // set RPHOTO as P1.7 alias

int REDLED = P1_0; // set REDLED as P1_0 alias
int GRNLED = P1_6; // set GRNLED as P1_6 alias

void setup()
{
  // set power block pin and led pins as outputs
  pinMode(PBRAIL, OUTPUT);
  pinMode(REDLED, OUTPUT);
  pinMode(GRNLED, OUTPUT);
  // set photocell input pins
  pinMode(LPHOTO, INPUT);
  pinMode(RPHOTO, INPUT);
}

void loop()
{
  digitalWrite(PBRAIL, HIGH); // supply 3.3V to the power rail
  delayMicroseconds(1000); // delay briefly to allow comparator outputs to settle

  if (digitalRead(LPHOTO) == HIGH) // if LPHOTO is on, turn REDLED on
  {
    digitalWrite(REDLED, HIGH);
  }
  else // otherwise, turn REDLED off
  {
    digitalWrite(REDLED, LOW);
  }

  if (digitalRead(RPHOTO) == HIGH) // if RPHOTO is on, turn GRNLED on
  {
    digitalWrite(GRNLED, HIGH);
  }
  else // otherwise, turn GRNLED off
  {
    digitalWrite(GRNLED, LOW);
  }

  digitalWrite(PBRAIL, LOW); // turn the power rail off again
  sleep(19); // wait 19 ms (can do other tasks in this time,
  // but we are simply demonstrating that you can cut power
  // to the circuits for 95% of the time and not notice!)
}
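
The 95% figure in the comment follows from the two delays in the loop. Here is a back-of-the-envelope sketch; it ignores the few microseconds spent on the reads and writes themselves, which is an assumption.

```python
on_ms = 1000 / 1000.0   # delayMicroseconds(1000): rail powered for about 1 ms
off_ms = 19.0           # sleep(19): rail unpowered for 19 ms

period_ms = on_ms + off_ms
duty = on_ms / period_ms        # fraction of time the sensors are powered
samples_per_second = 1000.0 / period_ms

print(f"powered {duty:.0%} of the time, about {samples_per_second:.0f} readings per second")
```

The sensors are powered only 5% of the time, yet the LEDs are still refreshed about 50 times per second, which is fast enough that the eye does not notice the gaps.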


Berkeley Electronic Interfaces Course – Robot Module 1 – Power supply


Recently I started an online course at edX. It is provided by UC Berkeley and is called Electronic Interfaces: Bridging the Digital and Physical Worlds. The course progressively builds a bouncing robot. I found the teaching and the course content very good and interesting, so I purchased the needed parts online and am now trying to build my own robot.

And this is the first module of the final robot 🙂

In order for the robot’s brain, the LaunchPad MSP430G2, to work with a 9V battery, we first need to implement a voltage regulator on the breadboard. Basically, it takes in 9V and outputs 3.3V to match the input voltage of the MSP430G2.

LM1086 voltage regulator

I try to use this blog to log the key information for each module so I can reference it later if I forget. I might later come back to add more info about why those capacitors are needed for this voltage regulator. As mentioned in the course the function of capacitors will be discussed in the later part of the course.


Transmission Loss Wiki Page Created


Finally, I have created this page for transmission loss on Wikipedia. Please help make it better if you are working with this concept frequently.

Here I just copy and paste some key content from the Wiki page for my own convenience 🙂

TL = L_{Wi} - L_{Wo} = 10 \log_{10} \left\vert {S_i p_{i+}^2 \over 2} {2\over S_o p_o^2 }\right\vert = 10 \log_{10} \left\vert {S_i p_{i+}^2 \over S_o p_o^2}\right\vert

Transmission loss (TL) (more specifically in duct acoustics) is defined as the difference between the power incident on a duct acoustic device (muffler) and that transmitted downstream into an anechoic termination. Transmission loss is independent of the source and presumes (or requires) an anechoic termination at the downstream end[1].

Transmission loss does not involve the source impedance and the radiation impedance inasmuch as it represents the difference between incident acoustic energy and that transmitted into an anechoic environment. Being made independent of the terminations, TL finds favor with researchers who are sometimes interested in finding the acoustic transmission behavior of an element or a set of elements in isolation of the terminations. But measurement of the incident wave in a standing wave acoustic field requires the use of impedance tube technology and may be quite laborious, unless one makes use of the two-microphone method with modern instrumentation.[1]

TL = 10 \log_{10}\left( {{1 \over 4} \left\vert { A + B {S \over \rho c} + C { \rho c \over S} + D }\right\vert^2}\right )
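
The last formula is easy to evaluate numerically. The sketch below assumes that A, B, C and D are the four-pole (transfer matrix) parameters relating sound pressure and volume velocity at the inlet and outlet, and that S is the cross-section area, equal at both ends. As a sanity check it uses the standard transfer matrix of a plain straight duct, which is not part of the text above; a uniform lossless duct should give a TL of 0 dB. Air properties, dimensions and function names are assumed example values.

```python
import math

def transmission_loss(a, b, c, d, area, rho=1.2, sound_speed=343.0):
    """TL in dB from four-pole parameters, for equal inlet and outlet area."""
    z = rho * sound_speed / area          # rho*c / S
    total = a + b / z + c * z + d         # A + B*S/(rho*c) + C*rho*c/S + D
    return 10.0 * math.log10(0.25 * abs(total) ** 2)

def straight_duct(k, length, area, rho=1.2, sound_speed=343.0):
    """Assumed textbook four-pole parameters of a uniform lossless duct."""
    z = rho * sound_speed / area
    kl = k * length
    return (math.cos(kl), 1j * z * math.sin(kl), 1j * math.sin(kl) / z, math.cos(kl))

frequency = 500.0                          # Hz, example value
k = 2 * math.pi * frequency / 343.0        # wavenumber
area = 0.002                               # m^2, example value

a, b, c, d = straight_duct(k, 0.3, area)
tl = transmission_loss(a, b, c, d, area)
print(f"TL of a plain 0.3 m duct at {frequency:.0f} Hz: {tl:.6f} dB")
```

A result of essentially zero confirms that the formula and the units are being combined consistently before it is applied to a real muffler matrix.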

Using Iterative Solvers in COMSOL to Solve Large Acoustics Problems

The issue

So the problem finally arrives: when a COMSOL acoustic model gets very large and we would still like to analyze up to a relatively high frequency (for example, above 2000 Hz), a great number of mesh elements is needed to represent the system well. In our case the mesh went up to around 400K elements, and my desktop computer with 16 GB of RAM was simply not enough for the default direct MUMPS solver.

So if you are new to COMSOL acoustics and have run into the same issue, here is the “easy” solution: use an FGMRES iterative solver instead of the default direct solver, with geometric multigrid as the preconditioner.

The cure

  1. Expand Study, Solver Configurations, Solver 1 and Stationary Solver 1.
  2. Right-click Stationary Solver 1 and choose Iterative. (Note that the default Direct solver becomes disabled afterwards.)
  3. Select the newly generated Iterative 1 solver; in the settings window, under the General section, select Solver: FGMRES.
  4. Right-click the Iterative 1 solver and choose Multigrid. (Note that the default Incomplete LU node becomes disabled afterwards.)

After this, the calculation can be carried out smoothly. Honestly, jargon such as GMRES and FGMRES was new to me as well, and it seemed a bit confusing. But if you simply set up the model as described above, it will probably work for you as well.


However, after going through the help documentation, the following tips seem worth paying attention to.

More tips one may need

  • Use GMRES as a smoother only if necessary because GMRES smoothing is very time- and memory-consuming on fine meshes, especially for many smoothing steps. (Or in other words, use FGMRES when you are not sure about what you are doing.)
  • Try to use as many multigrid levels as needed to produce a coarse mesh for which a direct method can solve the problem without using a substantial amount of memory.
  • If the coarse mesh is still too fine for a direct solver, try using an iterative solver with 5–10 iterations as coarse solver.

The tips above are copied from the COMSOL help documentation. More details can be found in the Acoustics Module User’s Guide, in the Modeling with the Acoustics Module chapter: under the Fundamentals of Acoustics Modeling section, see Solving Large Acoustics Problems Using Iterative Solvers.

Also, there is a Test Bench Car Interior model in the COMSOL model library you can refer to. But it generally illustrates the same setting suggested above.


In the end, with just a few clicks here and there, the big mesh issue was easily solved. RAM usage dropped from more than 8 GB to around 2 GB, and it solves fast as well. Simply amazing!


List of abbreviations

MUMPS  – Multifrontal massively parallel sparse direct solver.
GMRES  – Generalized Minimum RESidual  iterative method.
FGMRES – Flexible Generalized Minimum RESidual iterative method.

And special thanks to Mads J.H. Jensen and Kristian R. Jensen from COMSOL for their support on this topic.

A Geometry Trap for SolidWorks FlowSimulation

Geometry check results – Invalid contacts and invalid parts

I have been using SolidWorks Flow Simulation for some time, and in most cases it works just fine. However, this week I encountered a seemingly small geometry issue that gave me a big headache.

The first geometry trap (and yes, there is a second trap below, just keep reading), as the picture above shows, prevents Flow Simulation from finding the correct flow domain geometry. The Check Geometry tool provided by Flow Simulation reports errors called “Invalid contacts” and “Invalid parts”. (As you will see if you keep reading, this is not a leakage issue.)

This is a little strange, because the issue is within a sub-assembly that had already passed the air-tightness check in the simple FloXpress tool, done by my colleague. But when I included this sub-assembly in the top assembly, the above errors showed up during the geometry check and no flow domain could be obtained… Anyway, all the problems are located in the part connection shown below:

Overview 1

Problem parts


Note that the problem happens on this internal geometry, which also indicates that it is not a leakage issue but a geometry error, or invalid geometry. If one makes a cut plot and zooms in on the connection where the pipes meet the baffle, some of the intersections look like the picture below (upper two figures):


The idea for drawing a baffle that is in perfect contact with the pipe is: first draw a block with a perfectly round top surface (A) whose diameter is exactly the pipe diameter plus the shell thickness (lower figures above); then apply a shell operation through surfaces (B) and (A).

The reason those parts then don’t contact well and show misalignment is simply that part of the edge of the round surface A is connected to some fillets.

Look at the upper pictures on the right side: since there is a fillet surface (marked in blue), when the shell operation is applied the thickness is probably calculated along the normal direction of the fillet surface, so the shell body is slightly bent, as shown by the red line. This gives a misalignment in the model after the pipe is added.

So in short, the first geometry trap can be summarized as: be especially careful when applying a shell operation to a surface that is connected to fillets or other tilted surfaces.


Initially, after we saw the above misalignment, we decided on a quick fix using the “cavity” tool: simply cut the intersecting parts out of the baffle using the pipe body.

This solved the problem of “flow geometry not found”. But there were still error or warning messages about some invalid contacts. We didn’t pay attention to them and just went on with the simulation. The following picture is what we got after we started running it:



Yes, you saw it: the software crashed, or froze…


So here is the second geometry trap, the real trap: if one does not solve ALL the “Invalid contacts” and “Invalid parts” found by the Flow Simulation geometry check tool, then even though one can get the flow domain and proceed with the simulation, one may later suffer from the simulation or the software freezing…

Another thing worth mentioning is that, while those invalid contacts exist, normal simulation setup operations and geometry checks become extremely laggy and time consuming. So if a new model suddenly feels slower than usual during setup, there is probably some of that “invalid” stuff in your model. I suggest you fix it before going any further.

I am not sure whether the second trap happens in all circumstances, but in our case we later simply changed the baffle part so that the fillet ends before it touches the boundary of the round surface used to guide the shell operation, and all the problems were solved.

So if you have read this far, here is a post written by Christopher Ma: 4 Things to Do Before Every Flow Simulation Analysis. I believe that if one follows the process described in that article and makes sure all the invalid items are fixed, the chance of a software or simulation crash can be minimized.

Thank you for reading this far. I hope this article gives you some hints and, as always, do have fun with simulations 🙂