Fuyang's Blog

On the way to achieve minimalism and essentialism

Data Science and Machine Learning Essentials – Module1- Regression

Simple Linear Regression

  • When operating on the data in the training and test data-sets, the difference between the known y value and the value calculated by f(x) must be minimized. In other words, for a single entity, y – f(x) should be close to zero. This is controlled by the values of any baseline variables (b) used in the function.
  • For various reasons, computers work better with smooth functions than with absolute values, so the differences are squared. In other words, (y – f(x))² should be close to zero.
  • To apply this rule to all rows in the training or test data, the squared values are summed over the whole dataset, so ∑(yi – f(xi))² should be close to zero. This quantity is known as the sum of squared errors (SSE). The regression algorithm minimizes it by adjusting the values of the baseline variables (b) used in the function.
  • In statistics, the same quantity is called the residual sum of squares (RSS), also known as the sum of squared residuals (SSR) or the sum of squared errors of prediction (SSE).
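As a small illustration of the bullets above, here is a minimal sketch in plain Python that computes the SSE and fits f(x) = b0 + b1·x by the usual closed-form least-squares solution. The data points and the names `sse`, `fit_line`, `fitted` and `guess` are made up for the example and do not come from the course.

```python
def sse(xs, ys, f):
    """Sum of squared errors between the known y values and the predictions f(x)."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


def fit_line(xs, ys):
    """Closed-form least-squares fit of f(x) = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical training data, roughly following y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)


def fitted(x):
    return b0 + b1 * x


def guess(x):
    # An arbitrary hand-picked line, for comparison only
    return 2.5 * x


print("fitted SSE:", sse(xs, ys, fitted))
print("guess SSE: ", sse(xs, ys, guess))
```

The fitted line should never have a larger SSE than any other straight line on the same data, which is exactly what "minimizing the SSE" means.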


Ridge Regression

  • Minimizing the SSE for simple linear regression models (where there is only a single x value) generally works well, but when there are multiple x values, it can lead to over-fitting. To avoid this, you can use ridge regression, in which the regression algorithm adds a regularization term to the SSE. This helps achieve a balance of accuracy and simplicity in the function.

Ridge Regression

  • Note the two terms of the ridge objective, the SSE and the regularization term:
    1. Minimizing the first term asks the computer to keep its predictions close to the truth on the training set.
    2. Minimizing the second term asks it to keep the model “simple”, following the principle of Occam’s Razor.
  • Support vector machine regression is an alternative to ridge regression that uses a similar technique to minimize the difference between the predicted f(x) values and the known y values.
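To make the two terms concrete, here is a minimal sketch of a ridge objective in plain Python. It assumes the common L2 form (SSE plus a weight `lam` times the sum of squared coefficients, with the intercept left unpenalized); the names `predict`, `ridge_objective`, `lam` and the data are mine, not the course's.

```python
def predict(b, x):
    """Linear model f(x) = b[0] + b[1]*x[0] + b[2]*x[1] + ..."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))


def ridge_objective(b, rows, ys, lam):
    """First term: SSE. Second term: lam * sum of squared coefficients (assumed L2 penalty)."""
    sse = sum((y - predict(b, x)) ** 2 for x, y in zip(rows, ys))
    penalty = lam * sum(bj ** 2 for bj in b[1:])
    return sse + penalty


# Hypothetical data with two nearly identical features
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9)]
ys = [2.0, 4.1, 6.0, 8.1]

simple = [0.0, 1.0, 1.0]     # small coefficients
extreme = [0.0, 3.0, -1.0]   # fits the training rows slightly better, with larger coefficients

for lam in (0.0, 0.1):
    print(lam, ridge_objective(simple, rows, ys, lam), ridge_objective(extreme, rows, ys, lam))
```

With `lam = 0` only the SSE counts, so the model with larger coefficients wins; with a small positive `lam` the simpler model has the lower objective.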

Nested Cross-Validation

  • To determine the best algorithm for a specific set of data, you can use cross-validation (CV), in which the data is divided into folds, with each fold in turn being used to test algorithms that are trained on the other folds.
  • To compare algorithms, the one that performs best is the one with the best average out-of-sample performance across the 10 test folds. One can also check the standard deviation of the performance across the folds.
  • To determine the optimal value for a parameter (k) for an algorithm, you can use nested cross-validation, in which one of the folds in the training set is used to validate all candidate k values. This process is repeated so that each fold in the training set is used as a validation fold. The best result is then tested with the test fold, and the whole process is repeated again until every fold has been used as the test fold.

Nested cross-validation
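The nested procedure can be sketched in plain Python as below. This is only an illustration under assumptions: the tuned parameter (called k in the text) is here a ridge weight `lam`, the "algorithm" is a toy one-feature ridge fit without intercept, and the winner is refit on the whole training part before being scored on the test fold. All function names and data are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_ridge_1d(xs, ys, lam):
    """Toy model: b = sum(x*y) / (sum(x*x) + lam), predicting y = b * x."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(xs, ys, b):
    return sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def nested_cv(xs, ys, candidates, outer_k=5, inner_k=4):
    results = []
    for test_idx in k_fold_indices(len(xs), outer_k):
        test_set = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in test_set]
        inner_folds = k_fold_indices(len(train_idx), inner_k)

        def inner_score(lam):
            # Average validation error of one candidate over all inner folds
            total = 0.0
            for fold in inner_folds:
                val = [train_idx[j] for j in fold]
                val_set = set(val)
                fit = [i for i in train_idx if i not in val_set]
                b = fit_ridge_1d([xs[i] for i in fit], [ys[i] for i in fit], lam)
                total += mse([xs[i] for i in val], [ys[i] for i in val], b)
            return total / len(inner_folds)

        best = min(candidates, key=inner_score)
        # Refit with the winning candidate, then score once on the untouched test fold
        b = fit_ridge_1d([xs[i] for i in train_idx], [ys[i] for i in train_idx], best)
        score = mse([xs[i] for i in test_idx], [ys[i] for i in test_idx], b)
        results.append((best, score))
    return results


# Hypothetical data: y = 2x with a small alternating disturbance
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]

results = nested_cv(xs, ys, candidates=[0.0, 1.0, 10.0, 100.0])
for lam, score in results:
    print(lam, round(score, 4))
```

The list of outer scores is what you would average (and take the standard deviation of) when comparing algorithms.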

Data Science and Machine Learning Essentials – Module1- Classification, Regression, Clustering and Recommendation

Classification and Regression

  • Classification and regression use data with known values to train a machine learning model so that it can identify unknown values for other data entities with similar attributes.
  • Classification is used to identify Boolean (True/False) values. Regression is used to identify real numeric values. So a question like “Is this a chair?” is a classification problem, while “How much does this person earn?” is a regression problem.
  • Both classification and regression are examples of supervised learning, in which a machine learning model is trained using a set of existing, known data values. The basic principle is as follows:
  1. Define data entities based on a collection (or vector) of numeric variables (which we call features) and a single predictable value (which we call a label). In classification, the label has a value of -1 for False and +1 for True.
  2. Assemble a training set of data that contains many entities with known feature and label values – we call the set of feature values x and the label value y.
  3. Use the training set and a classification or regression algorithm to train a machine learning model to determine a function (which we call f) that operates on x to produce y.
  4. The trained model can now use the function f(x)=y to calculate the label (y) for new entities based on their feature values (x). In classification, if y is positive, the label is True; if y is negative, the label is False. The function can be visualized as a line on a chart, showing the predicted y value for a given x value. The predicted values should be close to the actual known values for the training set, as shown in figure 1 below.
  5. You can evaluate the model by applying it to a set of test data with known label (y) values. The accuracy of the model can be validated by comparing the value calculated by f(x) with the known value for y, as shown in figure 2.

Figure 1: A trained model


Figure 2: A validated model
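Steps 4 and 5 can be sketched for the classification case as follows. This is a minimal illustration assuming a linear f(x) with already-trained parameters; the weights, bias and test rows are invented for the example.

```python
def f(x, weights, bias):
    """Linear scoring function: weighted sum of the features plus a bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(x, weights, bias):
    """The label is True when f(x) is positive, False otherwise."""
    return f(x, weights, bias) > 0


def accuracy(rows, labels, weights, bias):
    """Fraction of test entities whose predicted label matches the known +1/-1 label."""
    hits = sum(classify(x, weights, bias) == (y > 0) for x, y in zip(rows, labels))
    return hits / len(rows)


# Hypothetical "trained" parameters and a tiny test set with known labels
weights, bias = [1.0, -1.0], 0.0
test_rows = [(2.0, 1.0), (0.5, 1.5), (3.0, 0.0), (1.0, 2.0)]
test_labels = [+1, -1, +1, +1]

print(accuracy(test_rows, test_labels, weights, bias))  # 3 of 4 correct -> 0.75
```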

  • In a supervised learning model, the goal is to produce a function that accurately calculates the known label values for the training set, but which is also generalized enough to predict accurately for known values in a test set (and for unknown values in production data). Functions that only work accurately with the training data are referred to as “over-fitted”, and functions that are too general and don’t match the training data closely enough are referred to as “under-fitted”. In general, the best functions are ones that are complex enough to accurately reflect the overall trend of the training data, but which are simple enough to calculate accurate values for unknown labels.

Figure 3: An over-fitted model


Figure 4: An under-fitted model


  • Clustering is an unsupervised learning technique in which machine learning is used to group (or cluster) data entities based on similar features.
  • It is difficult to evaluate clustering models, because there is no training set with known values that you can use to train the model and compare its results.


  • Recommender systems are machine learning solutions that match individuals to items based on the preferences of other similar individuals, or other similar items that the individual is already known to like.
  • Recommendation is one of the most commonly used forms of machine learning.

First impression on Python

I started learning Python a couple of days ago, and it seems to be an elegant and powerful language.

For example, a problem set in the MIT online course asks you to use a for loop to output a sequence of numbers like this:

10, 8, 6, 4, 2,

I know it is easy with basic programming techniques. But I was asking myself, since range(1,5) gives you:

1, 2, 3, 4,

then what is the shortest code to output a sequence like this:

4, 3, 2, 1,

So how? I tried things like range(5,1) and range(1,5,-1), but they didn’t work. (Later I found that “range(4,0,-1)” actually does it.)

Then I turned to Google and found that one can actually do this:


which gives the output

4, 3, 2, 1
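For reference, a minimal Python 3 sketch of a few ways to get this result. Whether any of them is the exact snippet from that search is an assumption; the variable names are just for illustration.

```python
forward = list(range(1, 5))                 # [1, 2, 3, 4]

by_step = list(range(4, 0, -1))             # count down explicitly with a step of -1
by_slice = list(range(1, 5))[::-1]          # slice the sequence with a step of -1
by_reversed = list(reversed(range(1, 5)))   # built-in reversed()

even_countdown = list(range(10, 0, -2))     # the 10, 8, 6, 4, 2 sequence from the problem set

print(by_step, by_slice, by_reversed)
print(even_countdown)

# The two attempts that "didn't work" simply produce empty sequences
print(list(range(5, 1)), list(range(1, 5, -1)))
```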

My impression of this was like:


This is brilliant and elegant, and perhaps also very efficient. However, after reading more of the discussion on the Stack Overflow page, I found an interesting thing. For example:

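import numpy as np

a = np.array(range(1, 11))
b = a[:]     # basic slice: a view of a, not a copy
c = a[::-1]  # reversed view of the same data
print(c)
a[1] = 100   # write through the original array
b[2] = 101   # write through the view b
print(c)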

and the output of it is:

[10 9 8 7 6 5 4 3 2 1]
[ 10 9 8 7 6 5 4 101 100 1]

which means c changes without being modified directly.

This is because “when you create c you are creating a view into the original array. You can then change the original array, and the view will update to reflect the changes,” as stated by someone on that Stack Overflow page. This could be extremely useful, and it feels like Python gives me back the freedom of easily using index or pointer-like functionality, which could mean that one can write very elegant code for efficient, memory-saving operations. It seems to have the potential.
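For contrast, a minimal sketch of how a plain Python list behaves (no NumPy involved; the variable names are just for illustration). Slicing a list builds a new list, so the view behaviour above is specific to NumPy arrays, while the built-in reversed() gives a lazy iterator that still reads from the original list:

```python
a = list(range(1, 11))

c = a[::-1]          # slicing a list builds a new list (a shallow copy)
a[1] = 100
print(c)             # c still holds the original values

lazy = reversed(a)   # an iterator that reads from a on demand
a[2] = 101
seen = list(lazy)    # the change made after creating the iterator is visible
print(seen)
```

So with lists you pay for a copy on every slice, which is the kind of cost the NumPy view avoids.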

Also, as someone from this stackoverflow post said:

I commonly work with >10GB 3D arrays of uint8’s, so I worry about this a lot… Numpy (seems to be a python math package) can be very efficient at memory management if you keep a few things in mind.

And he also mentioned other ways to save time and space such as how to avoid making a copy of an array and so on. I will need to go back to this post again sometime later, to get a better understanding of the language.

And so far my first impression of Python is like this:


Berkeley Electronic Interfaces Course – Capacitor revisited

We have previously talked about using a capacitor in our amplifier module. However, we didn’t mention the frequency behavior of this circuit. As we know, the output voltage of an RC circuit is frequency dependent. To make things easier to analyze, we need to talk about phasors, or phase vectors, first.

What is Phasor?

For now we just need to remember that the phasor is introduced to simplify calculation. A phasor is a complex number representing a sinusoidal function whose amplitude (A), angular frequency (ω), and initial phase (θ) are time-invariant. It is introduced as follows. As we all know:

v(t) = R \cdot i(t)

where v and i are the voltage and current of some circuit with resistance R, and both are time-dependent variables. Now let’s replace R with Z, which is called impedance; later you will see why this is convenient. For now we simply assume Z is something like R. (For resistors, Z = R.)

v(t) = Z \cdot i(t)

And, as we know, the time-dependent v and i can be represented as a cos(ωt + θ) function (or, more generally, a possibly infinite number of cos functions can be added up linearly to form any time-dependent waveform you may have). Together with Euler’s formula, we get something that looks like this:

\Re \lbrace \mathbb{V} e^{j \omega t} \rbrace = \Re \lbrace Z \cdot \mathbb{I} e^{j \omega t} \rbrace

where the phasor \mathbb{V} = V e^{j \theta} combines the amplitude and the initial phase, and \Re means taking the real part of the complex value inside the braces. Removing it together with the time-dependent part, we get:

\mathbb{V} = Z \cdot \mathbb{I}


So in the phasor world we can do simple calculations with phasors just as we used to do with voltages, currents and resistors. Note that in general Z = R + jX, where R is the resistance and X is the reactance, which describes the energy storage characteristic of the system.

So we can now conclude that for a resistor, capacitor and inductor, Z is:

Z_R = R

Z_C = {{1} \over {j \omega C}}

Z_L = j \omega L

So now we have a set of tools to describe things mathematically, since the following rules apply:

Components in series: Z_{eq} = Z_1 + Z_2 + Z_3 + ...

Components in parallel: {1 \over Z_{eq}} = {1 \over Z_{1}} + {1 \over Z_{2}} + {1 \over Z_{3}} + ...
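Python's built-in complex numbers make these rules easy to try out. The following is a minimal sketch; the helper names and the 1 kHz example frequency are my own choices, and the component values are only examples.

```python
import math


def z_resistor(resistance):
    """Z_R = R"""
    return complex(resistance, 0.0)


def z_capacitor(capacitance, omega):
    """Z_C = 1 / (j * omega * C)"""
    return 1 / (1j * omega * capacitance)


def z_inductor(inductance, omega):
    """Z_L = j * omega * L"""
    return 1j * omega * inductance


def series(*zs):
    """Z_eq = Z_1 + Z_2 + ..."""
    return sum(zs)


def parallel(*zs):
    """1 / Z_eq = 1 / Z_1 + 1 / Z_2 + ..."""
    return 1 / sum(1 / z for z in zs)


omega = 2 * math.pi * 1000.0        # 1 kHz, an arbitrary example frequency
zr = z_resistor(2700.0)
zc = z_capacitor(1e-6, omega)

print("Z_C:", zc, "magnitude:", abs(zc))
print("R and C in series:", series(zr, zc))
print("two equal resistors in parallel:", parallel(zr, zr))
```

The capacitor's impedance comes out purely imaginary and negative, which is the "reactance" part of Z.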

Example – RC circuit – Low Pass Filter


Now let’s see an example of how to use the math above. Consider the voltage V_C: it can be calculated as if the capacitor were a resistor in the same position, that is, as a voltage divider. (Note: from here on, capital letters denote complex numbers, or phasors.)

V_C = {Z_C \over {Z_C+Z_R}} V_S

V_C = {1 \over {1 + j \omega RC}} V_S


V_C = H \cdot V_S; H={1 \over {1 + j \omega RC}}


Notice that |H| is the magnitude of H, and it changes with frequency. When the frequency is as low as zero, the full source voltage appears on V_C; as the frequency goes higher and higher, the output on V_C gets lower and lower. This is called a low pass filter. We define the cutoff frequency of the filter as the frequency at which the output power drops by half, or equivalently the magnitude of H drops to about 0.707 of its value at zero frequency. One can do some calculation to show that when we let

|H| = {1 \over \sqrt{2}} |H(0)|

one gets the cutoff frequency as:

\omega = {1 \over {RC}}

Example – RC circuit – High Pass Filter

Similarly, for the high pass filter, we have:

H = {j \omega RC \over j \omega RC + 1}

And the cutoff frequency is again found from:

|H| = {1 \over \sqrt{2}} |H(\infty)|

\omega = {1 \over {RC}}
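Both transfer functions can be checked numerically with Python's complex numbers. A minimal sketch, using R = 2.7k Ohm and C = 1 uF as example values; the function names are mine.

```python
import math


def h_low_pass(omega, r, c):
    """H = 1 / (1 + j*omega*R*C)"""
    return 1 / (1 + 1j * omega * r * c)


def h_high_pass(omega, r, c):
    """H = j*omega*R*C / (j*omega*R*C + 1)"""
    x = 1j * omega * r * c
    return x / (x + 1)


r, c = 2700.0, 1e-6
omega_c = 1 / (r * c)   # cutoff angular frequency in rad/s

# Magnitude of H at frequencies well below, at, and well above the cutoff
for factor in (0.01, 0.1, 1.0, 10.0, 100.0):
    omega = factor * omega_c
    low = abs(h_low_pass(omega, r, c))
    high = abs(h_high_pass(omega, r, c))
    print(factor, round(low, 4), round(high, 4))
```

At `factor = 1.0` both magnitudes come out at about 0.7071, i.e. 1/√2, and the two filters mirror each other on either side of the cutoff.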


Experimenting with real circuit

So now let’s try to build some RC circuits on our own and measure how the magnitude of H changes with frequency, and also see whether the cutoff frequencies agree with our equation above.

We have two setups here, with different resistors but the same capacitor. The parameters are as below:

Case    R           C       Cutoff freq
1       2.7k Ohm    1 uF    58.95 Hz
2       300 Ohm     1 uF    530.52 Hz
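The cutoff values in the table are in Hz, while the formula above gives the angular frequency ω in rad/s, so they are related by f = ω / (2π). A minimal sketch of that calculation (the function name is mine):

```python
import math


def cutoff_hz(r_ohm, c_farad):
    """Cutoff frequency in Hz: f_c = omega_c / (2*pi) = 1 / (2*pi*R*C)."""
    return 1 / (2 * math.pi * r_ohm * c_farad)


for case, r in ((1, 2700.0), (2, 300.0)):
    print(case, round(cutoff_hz(r, 1e-6), 2))   # 58.95 and 530.52
```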

Figure: low pass, case 1 (experiment and Bode plot): 2.7k Ohm, 1 uF, fc = 59 Hz

Figure: high pass, case 1 (experiment and Bode plot): 2.7k Ohm, 1 uF, fc = 59 Hz

Figure: low pass, case 2 (experiment and Bode plot): 300 Ohm, 1 uF, fc = 531 Hz

Figure: high pass, case 2 (experiment and Bode plot): 300 Ohm, 1 uF, fc = 531 Hz

Voilà, the measured results fit the theoretical calculation very well. I am happy 🙂

Berkeley Electronic Interfaces Course – Robot Module 4 – Controlling the motor (Inductors, Transistors and Diode)


Time to make our robot able to jump and move around.

Transistor used for robot motor control: PN2222A npn transistor

PN2222A npn transistor, used in robot motor module

During this chapter of the course there are many deeper discussions of inductors, transistors, MOSFETs, BJTs, diodes and so on. It is interesting to understand things at a deeper level. However, to keep things simple for beginners like me, the key knowledge from this session can be summarized as:

  • How to switch things on and off with a small signal input
  • Why a diode is needed for an inductive load (for example, a motor)

Instead of providing the course notes and videos, I found these two short (and funny) videos on YouTube that show all the info you need to assemble this module for the robot:

First: Transistor / MOSFET tutorial

Second: Inductive spiking, and how to fix it

Berkeley Electronic Interfaces Course – Robot Module 3.3 – Amplifier (Speaker Driver & Choosing Amp)

In the previous chapter, we made the robot able to hear us. Now we try to let the robot make some noise, so we will connect a speaker to it.

Suppose we have a small 8 ohm speaker that we need to put on the robot. There are several ways to do that; we will start with some simple ones and see if they work.

Method 1 – Direct drive by the MSP430


As discussed in the course, with a setup like this, due to the low impedance of the speaker, the 3.3 V supply from the board would theoretically have to deliver around 412.5 mA. That’s a huge amount of current. However, the MSP430 datasheet says there is only a 48 mA maximum output across all the output pins. So when driving the speaker directly with the MSP430, not only will it not be very loud, but the MSP430 also won’t be able to do anything else on any of the output pins, because the speaker is taking all the current.


Pros:

  • Simple setup

Cons:

  • Draws too much power or current from the MSP430 board
  • Might not be able to drive the speaker loud enough (48 mA limit due to the MSP430 spec)

One way to solve this is to use an amp and let the amp provide the power to the speaker. And here is a way to do it.

Method 2 – Non-Inverting Amplifier Drive (OPA2344)


Analyzing the circuit with the 3 golden rules, the output of the amp should theoretically have a gain of 2, which gives around 6.6 V, hopefully driving the speaker with 825 mA. That should sound very loud. However, when you test it, as they did in the course video, you will hear that the speaker does not sound much different. The reason is that some low-power amplifiers have a relatively large output impedance, or in other words a relatively low short-circuit current: the current at the output when you short the output to ground. This short-circuit current is the maximum current the amp can deliver to the load at its output. The OPA2344 datasheet notes that the short-circuit current is around 15 mA, which is no more than with our previous direct drive method.


Pros:

  • Low power consumption (high impedance between inputs and low quiescent current of 150 uA)
  • No power drain from the MSP430

Cons:

  • Low short-circuit current (15 mA), so low output power
  • Consumes a little power in the grounded R1
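The numbers for methods 1 and 2 can be reproduced with Ohm's law. A minimal sketch, assuming a very simple model in which the speaker gets whichever is smaller, the ideal current or the source's limit; the function names are mine, and the limits are the 48 mA and 15 mA figures quoted above.

```python
def ideal_current_ma(volts, load_ohm):
    """Ohm's law, ignoring any source limit: I = V / R, returned in mA."""
    return volts / load_ohm * 1000.0


def delivered_current_ma(volts, load_ohm, limit_ma):
    """Simplified assumption: the load gets at most what the source can supply."""
    return min(ideal_current_ma(volts, load_ohm), limit_ma)


SPEAKER_OHM = 8.0

# Method 1: 3.3 V straight from the MSP430 (48 mA limit)
print(ideal_current_ma(3.3, SPEAKER_OHM), delivered_current_ma(3.3, SPEAKER_OHM, 48.0))

# Method 2: 6.6 V from the OPA2344 stage (about 15 mA short-circuit current)
print(ideal_current_ma(6.6, SPEAKER_OHM), delivered_current_ma(6.6, SPEAKER_OHM, 15.0))
```

In both cases the ideal demand is far above what the source can deliver, which is why doubling the voltage does not make the speaker louder.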

So an easy way to solve the low output issue above is to use a type of amp specially designed to drive speakers: an amp with low output impedance and high short-circuit current.

Method 3 – Audio Amplifier Drive (LM380 Drive)


For example, we can use an LM380-8 amp as a comparator to drive the speaker, as shown above. With a short-circuit current of 1.3 A, this can for sure drive the speaker very loud. However, this amp has a drawback in our application: it can drain the battery quickly. It has a relatively low input impedance compared with the previous OPA2344 amp, and its quiescent current is 7 mA, versus only 150 uA for the OPA2344. (As far as I remember, the quiescent current is the current an amp “naturally” draws from its supply without any load connected. The lower the quiescent current, the lower the power consumption.)


Pros:

  • High short-circuit current of 1.3 A, powerful output
  • No power drain from the MSP430

Cons:

  • High power consumption: the quiescent current is typically 7 mA

So what do we do? Practically, we don’t need the speaker to be extremely loud in our robot, so method 2 will more or less do the job. However, noticing the power drain in R1, we can change the circuit to drop the feedback resistors and use direct feedback instead, which makes a voltage follower.

Method 4 – Voltage Follower Drive


In this way the output voltage has no gain, and we can still output up to 15 mA to the speaker, which sounds quite alright. The power consumption is also the lowest among the methods shown here.


Pros:

  • Ultra low power consumption
  • No power drain from the MSP430

Cons:

  • Low output power, but for our case it is acceptable

I believe there are more methods out there that could be discussed to improve the design. But since this course is not aimed at those advanced topics and method 4 will do fine for our robot, we decided to go on with it for now 🙂

Berkeley Electronic Interfaces Course – Robot Module 3.2 – Amplifier

The Golden Rules

Basically, using the golden rules shown above, one can approximately analyze an amplifier circuit to find the output gain of a specific amplifier setup.

Microphone Front End


Then let’s use the circuit designed above to make a microphone front end. Together with the on-board ADC, we can verify the microphone readings via an on-board LED. So the signal flow looks like this.


Bill of Materials:

  • resistors: 2.7k, 10k (x3), 100k
  • electret microphone
  • ceramic disk capacitor (1 microfarad)
  • OPA2344 (or equivalent dual op amp chip)


Parts I changed for my case, because the microphone I bought is not sensitive enough:

  • Rs changed from 2.7k Ohm to 300 Ohm
  • Microphone is powered by 3.3 V instead of 9 V (it seems to make no difference for me, but I changed it anyway hoping to save some power)

Single-Supply Circuit 

Since we are working with a single battery voltage source (the op amp rails are connected to 3.3 V and ground, rather than +3.3 V and -3.3 V), we have to carefully consider our reference. The voltage divider at the op amp’s non-inverting input sets the circuit’s reference to 1.65 V.

In this way, an AC input that swings negative and positive produces an output oscillating around 1.65 V, up to 3.3 V and down to 0 V. This is called a single-supply circuit.

For our example, let’s set Rf = 100 kOhm, Rs = 2.7 kOhm and Vcc = 3.3 V. Using golden rule number 2, we know the DC voltage at both the inverting and non-inverting terminals of the op amp is Vcc/2. Also note that, because of the capacitor, the DC voltage at the junction of the capacitor and Rs (which is also the input end of the amp stage) is Vcc/2 as well.

Using golden rule number 1 and applying KCL at the inverting terminal of the amp, we get:

i_{R_s} = i_{R_f}

which means the current through Rs equals the current through Rf. By Ohm’s law we get:

{{V_{cc} / 2 + V_{in} - V_{cc} / 2} \over R_s }={ {V_{cc} / 2 - V_{out}} \over R_f }

V_{out} = {V_{cc} / 2 - {R_f \over R_s} V_{in}}

so in our case that

V_{out} = {1.65 - 37 V_{in}}
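A quick numeric check of this result, as a minimal sketch. It assumes an ideal op amp whose output simply clips at the 0 V and 3.3 V rails; the function name and the input voltages are made up for the example.

```python
def mic_amp_output(v_in, r_f=100e3, r_s=2.7e3, v_cc=3.3):
    """Ideal single-supply inverting stage: Vout = Vcc/2 - (Rf/Rs) * Vin, clipped to the rails."""
    v_out = v_cc / 2 - (r_f / r_s) * v_in
    return max(0.0, min(v_cc, v_out))


print("gain with Rs = 2.7k:", round(100e3 / 2.7e3, 2))   # about 37
print("gain with Rs = 300: ", round(100e3 / 300.0, 2))   # about 333, my more sensitive setup

# Small AC input amplitudes around the 1.65 V reference (values in volts)
for v_in_mv in (-50, -10, 0, 10, 50):
    print(v_in_mv, "mV ->", round(mic_amp_output(v_in_mv / 1000.0), 3), "V")
```

With a gain of about 37, an input of only around ±45 mV is already enough to push the output to the rails.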

So that’s it. With the code loaded onto the MSP430G2 controller, we can use the green LED to show whether the microphone has heard something or not.

Course info:

“EE40LX teaches the fundamentals of engineering electronic interfaces between the physical world and digital devices. Students can expect to cover the material of a traditional first circuits course with a project-based approach. We start with essential theory and develop an understanding of the building blocks of electronics as we analyze, design, and build different parts of a robot from scratch around a microcontroller. This course uses the Texas Instruments MSP430G2 LaunchPad, but you are welcome to bring whichever development board or microcontroller you like!”

// EE40LX
// Sketch 3.6
// Description: the MSP430 reads the output of the microphone circuit at P1.5 and
// decides whether or not to flash on an LED based on the sound level
// Tom Zajdel
// University of California, Berkeley
// July 27, 2015
// Version 1.1 July 27, 2015 - Added curly brackets to conditional statements
// - No longer declaring value in global scope

int MICINP = A5; // set MICINP as P1.5 alias
int GRNLED = P1_6; // set GRNLED as P1.6 alias

void setup()
{
  Serial.begin(9600); // start the serial monitor (the 9600 baud rate is assumed here)
  pinMode(GRNLED, OUTPUT); // set GRNLED as output pin
  Serial.println("Setup complete!");
}

void loop()
{
  int value; // declare variable value to store result of analogRead
  value = analogRead(MICINP); // get the voltage from the microphone
  Serial.println(value); // write digitized value to serial monitor

  if (value >= 515) // original threshold was 560; I changed it to 515 to make it more sensitive
  {
    digitalWrite(GRNLED, HIGH); // turn on the LED...
  }
  else
  {
    digitalWrite(GRNLED, LOW); // ...else turn off the LED
  }
  delay(1); // delay in between reads for stability
}

In order to use analog input pins, we use the “AX” as an alias. All pins in Port 1 may be used as analog input pins (see http://energia.nu/Guide_MSP430LaunchPad.html for more information).

To read the voltage at this pin in the main loop, we use the analogRead() function, which converts the voltage to a 10-bit integer. We also print this value to the Serial monitor and use it to determine whether or not to turn on the green LED.

The value 560 corresponds to a voltage of 3.3 V \cdot {560 \over 1023} = 1.806 V. That is, the LED turns on whenever this voltage (the output of the microphone amplifier) exceeds 1.806 V.
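The same conversion can be applied to any threshold, including the 515 I used. A minimal sketch, assuming a 3.3 V reference and the 1023 full-scale count used in the formula above; the function name is mine.

```python
def adc_to_volts(counts, v_ref=3.3, bits=10):
    """Convert an analogRead() count to volts, with full scale = 2**bits - 1 as in the text."""
    return v_ref * counts / (2 ** bits - 1)


for threshold in (560, 515):
    print(threshold, round(adc_to_volts(threshold), 3))   # 1.806 V and 1.661 V
```

So a threshold of 515 sits only about 11 mV above the 1.65 V reference, which is why it reacts to much quieter sounds.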



Berkeley Electronic Interfaces Course – Robot Module 3.1 – Comparator

Power Blocking

Instead of powering the Wheatstone bridge and its sensor all the time, one should power it from an output pin of the MSP430G2, so that the bridge is only powered for a short period at a time. It’s like taking measurements intermittently, which saves a lot of power.

The way to do it is shown below: use P1.1 on the MSP430G2 as the power output and connect it to the front rail, where all sensors are connected.


Now the Wheatstone bridge has been power blocked 🙂


Then we can attach the Wheatstone bridge sensor to the OPA2344 amplifier. Here the amp works as a comparator, so the small voltage difference between the two Wheatstone output pins is compared, and the comparator outputs either +3.3 V or 0 V.

As shown below:

  • First, connect the amplifier power supply on the breadboard;
  • Then connect the Vout of the Wheatstone bridge to the input pins of the OPA2344;
  • Then connect the output of the amp to pin P1.2 or P1.7 to control, as an example here, the on-board red or green LED.



There we go. Now if we boot the chip up, it is easy to test that when light shines on the photocell, the on-board LED lights up, and vice versa.



// EE40LX
// Sketch 3.2
// Description: Power-block a 3.3V rail at P1.1 and subsequently read inputs from
// Wheatstone bridges, connected to P1.2 and P1.7
// Tom Zajdel
// University of California, Berkeley
// July 27, 2015
// Version 1.2 July 27, 2015 - Added curly brackets to conditional statements
// Version 1.1 January 26, 2015 - Fixed a timing bug by using delayMicroseconds()
// and also corrected errors in pin assignment
// see pins_energia.h for more LED definitions

int PBRAIL = P1_1; // set PBRAIL as P1.1 alias
int LPHOTO = P1_2; // set LPHOTO as P1.2 alias
int RPHOTO = P1_7; // set RPHOTO as P1.7 alias

int REDLED = P1_0; // set REDLED as P1_0 alias
int GRNLED = P1_6; // set GRNLED as P1_6 alias

void setup()
{
  // set power block pin and led pins as outputs
  pinMode(PBRAIL, OUTPUT);
  pinMode(REDLED, OUTPUT);
  pinMode(GRNLED, OUTPUT);
  // set photocell input pins
  pinMode(LPHOTO, INPUT);
  pinMode(RPHOTO, INPUT);
}

void loop()
{
  digitalWrite(PBRAIL, HIGH); // supply 3.3V to the power rail
  delayMicroseconds(1000); // delay briefly to allow comparator outputs to settle

  if (digitalRead(LPHOTO) == HIGH) // if LPHOTO is on, turn REDLED on
  {
    digitalWrite(REDLED, HIGH);
  }
  else // otherwise, turn REDLED off
  {
    digitalWrite(REDLED, LOW);
  }
  if (digitalRead(RPHOTO) == HIGH) // if RPHOTO is on, turn GRNLED on
  {
    digitalWrite(GRNLED, HIGH);
  }
  else // otherwise, turn GRNLED off
  {
    digitalWrite(GRNLED, LOW);
  }
  digitalWrite(PBRAIL, LOW); // turn the power rail off again
  sleep(19); // wait 19 ms (can do other tasks in this time,
  // but we are simply demonstrating that you can cut power
  // to the circuits for 95% of the time and not notice!)
}
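The 95% figure follows from the timing in the loop: the rail is on for about 1 ms and off for 19 ms. A minimal sketch of that arithmetic, ignoring the time taken by the other statements; the function names and the 1 mA bridge current are hypothetical.

```python
def duty_cycle(on_ms, off_ms):
    """Fraction of each cycle during which the rail is powered."""
    return on_ms / (on_ms + off_ms)


def average_current_ma(active_ma, on_ms, off_ms):
    """Average current drawn by a load that is only powered while the rail is on."""
    return active_ma * duty_cycle(on_ms, off_ms)


on_ms, off_ms = 1.0, 19.0            # delayMicroseconds(1000) and sleep(19)
d = duty_cycle(on_ms, off_ms)
print("powered:", d, "unpowered:", 1 - d)

# Hypothetical bridge drawing 1 mA while powered
print("average current (mA):", average_current_ma(1.0, on_ms, off_ms))
```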


Berkeley Electronic Interfaces Course – Robot Module 2 – Wheatstone Bridge


This second module on the EE40LX course is about using a Wheatstone bridge: any resistive sensor could be used in the bridge. The final robot we build in the course has two photocell “eyes.”

One key note on the Wheatstone bridge: if one would like the swing of V_{out} to include both positive and negative voltages, one has to choose R1, R2 and R4 with values between the lower and upper limits of R3, which is a photocell (VT90N1) in this case.


Photocell used here is EXCELITAS TECH VT90N1 LDR, 200KOHM, 80MW, VT900 Series.
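The sign change of V_{out} can be illustrated numerically. This is only a sketch under an assumed topology (R1 over R2 on one leg, the photocell R3 over R4 on the other, V_{out} taken between the two midpoints), which may differ from the course schematic; the 10k fixed resistors and the photocell values are hypothetical.

```python
def bridge_vout(v_s, r1, r2, r3, r4):
    """Assumed topology: Vout = midpoint of the R1/R2 divider minus midpoint of the R3/R4 divider."""
    return v_s * (r2 / (r1 + r2) - r4 / (r3 + r4))


V_S = 3.3
R_FIXED = 10e3   # hypothetical value used for R1, R2 and R4

# Sweep the photocell resistance R3 from bright (low) to dark (high)
for r3 in (2e3, 5e3, 10e3, 50e3, 200e3):
    v = bridge_vout(V_S, R_FIXED, R_FIXED, r3, R_FIXED)
    print(int(r3), round(v, 3))
```

With the fixed resistors inside the photocell's range, V_{out} is negative for small R3, zero at balance and positive for large R3, which is what the comparator stage needs.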

Berkeley Electronic Interfaces Course – Robot Module 1 – Power supply


Recently I started an online course on edX. It is provided by UC Berkeley and is called Electronic Interfaces: Bridging the Digital and Physical Worlds. The course progressively builds a bouncing robot. I found the teaching and the course content very good and interesting, so I purchased the parts needed online and am now trying to build my own robot.

And this, is the first module of the final Robot 🙂

In order to make the robot’s brain, which is the LaunchPad M430G2, able to work with a 9 V battery, we first need to implement a voltage regulator on the breadboard. Basically, it takes in 9 V and outputs 3.3 V to match the input voltage of the M430G2.

LM1086 Voltage regulator

I try to use this blog to log the key information for each module so I can reference it later if I forget. I might later come back to add more info about why those capacitors are needed for this voltage regulator. As mentioned in the course the function of capacitors will be discussed in the later part of the course.