Brushing up on Linear Algebra

Hermitian Matrix

Square matrix with complex entries that is equal to its own conjugate transpose.


[3 2+i; 2-i 1]
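
The definition can be checked mechanically. The following is a minimal sketch in plain Python; the helper names `conjugate_transpose` and `is_hermitian` are made up for this illustration and do not come from any library.

```python
def conjugate_transpose(m):
    """Return the conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def is_hermitian(m):
    """A matrix is Hermitian if it is square and equals its conjugate transpose."""
    if any(len(row) != len(m) for row in m):
        return False
    return conjugate_transpose(m) == m


# The example matrix from above: [3 2+i; 2-i 1]
A = [[3, 2 + 1j],
     [2 - 1j, 1]]

print(is_hermitian(A))  # True
```

Note that the diagonal entries of a Hermitian matrix must be real, since each one has to equal its own conjugate.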

Positive Definite Matrix

An n x n real matrix M is positive definite if z'Mz > 0 for every non-zero vector z with real entries (z' is the transpose of z).

For a complex Hermitian matrix M, the condition becomes z*Mz > 0, where z* is the conjugate transpose of z.


[z_0 z_1] [1 0; 0 1] [z_0; z_1] = z_0^2 + z_1^2, which is > 0 for any non-zero z

Therefore, [1 0; 0 1] is positive definite
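
To make the quadratic form concrete, here is a small sketch that evaluates z'Mz directly. The function name `quadratic_form` and the second matrix `not_pd` are assumptions added for illustration; checking a few vectors does not prove positive definiteness, but a single vector with a non-positive result is enough to disprove it.

```python
def quadratic_form(z, M):
    """Compute z' M z for a real vector z and a real square matrix M."""
    n = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(n) for j in range(n))


identity = [[1, 0], [0, 1]]
not_pd = [[1, 2], [2, 1]]  # hypothetical counterexample, not from the notes

print(quadratic_form([3, 4], identity))  # 25, i.e. 3^2 + 4^2
print(quadratic_form([1, -1], not_pd))   # -2, so not_pd is not positive definite
```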


Eigenvectors

Non-zero vectors that remain parallel to themselves when a given matrix (read: transformation) is applied to them. They may be stretched, shrunk, or flipped, but they stay on the same line.

Av = lambda * v, where lambda is the eigenvalue of A corresponding to the eigenvector v.
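
A quick numeric check of this relation, as a sketch: the matrix `A` and the helper `mat_vec` below are chosen for illustration only. The vector [1, 1] is scaled by 3 and stays on its line, while [1, 0] changes direction and is therefore not an eigenvector of this particular A.

```python
def mat_vec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


A = [[2, 1],
     [1, 2]]

v = [1, 1]
print(mat_vec(A, v))  # [3, 3] = 3 * v, so v is an eigenvector with lambda = 3

w = [1, 0]
print(mat_vec(A, w))  # [2, 1], not a multiple of w
```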

Cholesky Decomposition

Decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose (take the transpose, then negate the imaginary parts but not the real parts). Analogous to taking the square root of a number.

A = LL*, where L is a lower triangular matrix with positive diagonal entries.
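
As a rough sketch of how L can be computed, the function below handles only the real symmetric case (where L* is simply the transpose of L) and fills L row by row. The function name and the 3 x 3 test matrix are assumptions for illustration; a numerical library routine would normally be used in practice.

```python
import math


def cholesky(A):
    """Return lower triangular L with A = L L^T for a real symmetric
    positive-definite matrix A given as a list of rows."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


A = [[4, 12, -16],
     [12, 37, -43],
     [-16, -43, 98]]

for row in cholesky(A):
    print(row)
# [2.0, 0.0, 0.0]
# [6.0, 1.0, 0.0]
# [-8.0, 5.0, 3.0]
```

The diagonal entries come out positive, matching the condition on L stated above, and a non-positive value under the square root signals that the input was not positive definite.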

As an Outsider Looking In.

It’s beautiful and tragic to stand against the wall, hearing your friends tell the most vulnerable stories of their lives, their suffering, and their private sorrows. It’s beautiful because it enables you to love that person for sharing a part of themselves so freely. It’s tragic because you are the outsider looking in, wanting the change to happen now so you can obliterate the cause of suffering. But we all know that this is wishful thinking for now.

The stories were about living in the United States as an undocumented student. By now, I feel familiar with these stories and half expected myself to feel a bit blasé. But these stories are somehow endowed with new meaning with each retelling. They present new revelations about life and about your friends. You marvel at the things you didn’t know even after knowing your friends for more than two years.

Stories were read aloud as part of the reception for a new website launch. The purpose of the website is to share personal stories of immigration, especially those of undocumented students, with a wider audience. Many of the stories on the website are from peers who attended the same summer writing workshop I attended. We gathered in San Francisco and critiqued each other’s stories about our lives as immigrants.

It was a bond forged in sharing the most intimate parts of our lives. The darkest periods that we vowed never to think of again and tucked safely underneath our consciousness. I remember how difficult and frustrating it was to bring them to the surface again. But we agreed tonight that the experience of writing and reading them aloud healed us in many ways. The broken immigration system in the United States had scarred us in ways that we were not aware of until we reflected and shared. We healed through opening up and acknowledging that there is a community of us who went through the same trials.

The stories are usually about parting with loved ones. Grandfathers and grandmothers whom we left behind, knowing we’ll never see them again. It’s about the everyday objects from our previous lives that seem so precious in our memory now. The sibling who’s left behind without visitation rights. About wanting to apply for scholarships that demand social security numbers. The feeling of being considered an unwanted outsider even when we are not.

My friends ask me what I’m up to these days. I fumble through my answer, feeling tremendous guilt over the fact that I have a job at NASA when my friends are barred from working anywhere. Our only difference is the nine-digit number that grants a work authorization. We all find it silly and confusing that these nine digits can hold so much power. I feel as if I am looking in from the world of opportunity, only nine digits away from the world that stops dreams midway through their fruition. I’m looking forward to the day when this separation no longer exists: a land where all of us live free of anxiety and fear, a land where everyone is truly equal.

Writing Efficient Code for OpenCL Applications

This is part 2 of the OpenCL Webinar Series put out by Intel. There was some good information about optimization in general, but a lot of it focused on how to optimize OpenCL running on the 3rd Gen Intel Core Processor. I jotted down some optimization notes that are applicable to all processors.

Avoid Invariants in OpenCL Kernels
Move anything that is independent of kernel execution (invariants, constants) to the host, so it is computed once instead of once per work item.

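// Before: every work item recomputes the same invariants.
__kernel void test1(__global int* data, int2 size, int base) {
    int offset = size.y * base + size.x;
    float offsetF = (float)offset;
    // ...
}

// After: the host computes offset and offsetF once and passes them in.
__kernel void test1(__global int* data, int offset, float offsetF) {
    // ...
}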

Avoid Initialization in the Kernel
Move one-time initialization to the host or into a separate kernel.

__kernel void something(__global int* data) {
    size_t tid = get_global_id(0);
    if (0 == tid) {
        // Do something (one-time initialization that only work item 0 runs)
    }
    // ...
}

Use the Built in Functions
e.g. dot, hypot, clamp, etc.

Trade Accuracy vs Speed
Once the output is correct, look at the performance. Where reduced precision is acceptable, use the faster, less accurate built-ins such as “mad” or “native_sin”.

Get rid of edge conditions

// Every work item pays for these edge checks.
__kernel void myKernel(__global int* data, int maxRow, int maxCol) {
    int row = get_global_id(0);
    int col = get_global_id(1);

    if (row > maxRow) {
        // ...
    } else if (col > maxCol) {
        // ...
    }
}

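Use the NDRange instead: launch the kernel over a smaller range that only covers the interior, so the conditionals are not needed.
Or use padded buffers, so that work items at the edges can read and write safely without a check.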
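
The padding idea is independent of OpenCL, so here is a plain Python stand-in rather than kernel code. Each loop iteration plays the role of a work item, and the 3-point sum is a made-up workload chosen for illustration. The first version checks the edges on every iteration; the second pads the buffer once so that every iteration runs the same branch-free body.

```python
def sum3_with_branches(data):
    """Each 'work item' i checks whether its neighbors exist."""
    n = len(data)
    out = []
    for i in range(n):
        left = data[i - 1] if i > 0 else 0
        right = data[i + 1] if i < n - 1 else 0
        out.append(left + data[i] + right)
    return out


def sum3_with_padding(data):
    """Pad once with zeros; every 'work item' then runs the same code."""
    padded = [0] + list(data) + [0]
    return [padded[i - 1] + padded[i] + padded[i + 1]
            for i in range(1, len(padded) - 1)]


data = [1, 2, 3, 4]
print(sum3_with_branches(data))  # [3, 6, 9, 7]
print(sum3_with_padding(data))   # [3, 6, 9, 7]
```

The trade-off is a little extra memory for the padding in exchange for a uniform body, which is what lets the work items run in lockstep.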

Get rid of edge conditions in general
Use logical ops instead of comparison/conditions.

Reduce number of registers to increase parallelism

Avoid Byte/Short Load and Stores
Load and store in larger chunks.
e.g. uchar -> uint4

Don’t use too many barriers in a kernel
A barrier makes every work item in the work-group wait until all of them have reached it.
If a work item never reaches the barrier, the code may hang.

Use the vector types (float8, double4, int4, etc) for better performance on the CPU

Preferred work-group size for kernels is 64 or 128 work items.
-Work-group size should be a multiple of 8.
 Query the CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE parameter by calling clGetKernelWorkGroupInfo.
-Match the number of work-groups to the number of logical cores.

Buffer Object vs. Image Objects
CPU Device:
– Avoid using Image Object and use Buffer object instead.

Brushing up on Probabilities, Localization, and Gaussian

Gaussian: It is a bell curve characterized by its mean and variance. It is unimodal and symmetric. The area under the Gaussian adds up to 1.

Variance: measure of uncertainty. Large variance = more spread = more uncertain.
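
These properties are easy to check numerically. The sketch below uses the standard one-dimensional Gaussian density; the function names and the integration settings (`half_width`, `steps`) are arbitrary choices for illustration.

```python
import math


def gaussian(x, mean, variance):
    """Density of a 1D Gaussian with the given mean and variance at x."""
    return (math.exp(-((x - mean) ** 2) / (2.0 * variance))
            / math.sqrt(2.0 * math.pi * variance))


def area_under_gaussian(mean, variance, half_width=10.0, steps=20000):
    """Midpoint-rule estimate of the area over mean +/- half_width std devs."""
    sigma = math.sqrt(variance)
    lo = mean - half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    return sum(gaussian(lo + (i + 0.5) * dx, mean, variance)
               for i in range(steps)) * dx


# The area stays (approximately) 1 regardless of mean and variance.
print(round(area_under_gaussian(0.0, 1.0), 6))
print(round(area_under_gaussian(5.0, 4.0), 6))

# Larger variance means a lower, wider curve: less certainty about x.
print(gaussian(0.0, 0.0, 1.0) > gaussian(0.0, 0.0, 4.0))
```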

Bayes Rule

P(A|B) = P(B|A) P(A) / P(B)

Localization

Involves a “move” (motion) step and a “sense” (measurement) step.

Motion (move): First the robot moves. We use convolution to get the probability that the robot ended up in each grid cell. This applies the theorem of total probability: for each cell, sum over every previous cell the probability of having been there times the probability of moving from there to the current cell.

Measurement (sense): Then the robot senses the environment. We multiply the prior of each cell by the probability of the sensor reading given that cell (a higher weight where the reading matches the cell, a hit, and a lower weight where it doesn’t, a miss), then normalize so the probabilities sum to 1. This applies Bayes Rule.

*Side note: for the grid-based localization method (histogram method), memory use grows exponentially with the number of state variables (x, y, z, theta, roll, pitch, yaw, etc.)
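
The two steps can be sketched as a 1D histogram filter on a cyclic grid. Everything concrete here is an assumption for illustration: the five-cell `world`, the sensor weights `p_hit` and `p_miss`, and the motion noise `p_exact`, `p_over`, and `p_under`.

```python
def move(p, step, p_exact=0.8, p_over=0.1, p_under=0.1):
    """Motion step: convolve the belief with the motion noise
    (total probability over all cells the robot could have come from)."""
    n = len(p)
    return [p_exact * p[(i - step) % n]
            + p_over * p[(i - step - 1) % n]
            + p_under * p[(i - step + 1) % n]
            for i in range(n)]


def sense(p, world, measurement, p_hit=0.6, p_miss=0.2):
    """Measurement step: weight each cell by how well it explains the
    reading, then normalize (Bayes Rule)."""
    q = [pi * (p_hit if cell == measurement else p_miss)
         for pi, cell in zip(p, world)]
    total = sum(q)
    return [qi / total for qi in q]


world = ["green", "red", "red", "green", "green"]
belief = [1.0 / len(world)] * len(world)  # start fully uncertain

belief = move(belief, 1)              # a uniform belief stays uniform
belief = sense(belief, world, "red")  # red cells become more likely
print(belief)
```

Moving spreads the belief out and loses information, while sensing sharpens it; alternating the two is what lets the robot converge on its position.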


Moonrise Kingdom

My roommate, Nicole, invited me to a showing of Moonrise Kingdom as part of her birthday celebration. I had seen the trailer and gathered that it’s a film about two kids in New England who decide to run away together. The film takes place in the sixties. I would have dismissed the movie if it weren’t for the cast and the pretty movie trailer (and the 94% Rotten Tomatoes rating also helped). I swept aside my Asian guilt (I thought about spending this evening coding) and joined her and her friends for the movie showing.

Moonrise Kingdom, whose title sounds like a Chinese martial arts movie, is so pretty. It’s like candy to your eyes and ears. It’s so pretty that I wanted to take frames out of it and post them around my room. Every scene is like a vintage Polaroid photo. An advertisement from the 60s. It’s as if Wes Anderson shot the movie through Instagram.

The plot is simple and yet each moment is full of innocence, wonderment, and adventure. And Wes Anderson portrays children as complete human beings with full mental faculties and emotional complexity! But frankly all that stands out in my memory is the color palette. The sepia and pink hues, the golden fields, washed out blues, red and green standing out against the desaturated background. There is such a decisive and consistent look throughout the film.

I guess one thing that slightly bothered me was the visual imagery of the night time scene. The rain is pouring and it’s the evening. And either the director or the post-production crew decided to blue-filter the sh#$ out of it. So everyone’s faces look blue. Like ghosts. I didn’t particularly like the look but maybe it was the look they were going for.

I appreciated that the film didn’t water down childhood. I find that childhood portrayed through Hollywood is either idealistic, fantastical, really sad and dreary, or at some other end of the spectrum. The subtlety and complexity are usually entirely missing. Wes Anderson deals with it very delicately, being careful not to shift to the extremes I mentioned. And the acting from these kids is amazing. I loved the gaze of the main character (the girl). It’s always distant and full of meaning, and we can never really guess all that goes on inside her head.

Overall, I loved it. Two hours well spent and coding could wait.

Intel SDK for OpenCL Applications Webinar Series 2012

Intel hosted a webinar on running OpenCL on the Intel Core processor. The webinar I attended this morning (9am, July 11th) is the first part of a three-part webinar series on this topic. It was well organized and educational, and I think the next seminar will be even more useful (since it deals with programming using OpenCL). I took notes during the webinar to get you up to speed in case you want to attend the next two seminars.

* July 18 - Writing Efficient Code for OpenCL Applications
* July 25 - Creating and Optimizing OpenCL Applications

OpenCL: Allows us to swap out loops with kernels for parallel processing.

Introduction: Intel’s 3rd Generation Core Processor.

  • Inter-operability between CPUs and HD Graphics.
  • Device 1: maps to the four cores of the Intel processor (CPUs)
  • Device 2: Intel HD Graphics.
  • Allows access to all compute units available within system (unified compute model – CPU and HD Graphics)
  • Good for multi-socket CPUs, if you want to divide the OpenCL code along the underlying memory architecture.
  • Supported on Windows 7 and Linux.

General Electric’s use of OpenCL

  • GE uses OpenCL for image reconstruction for medical imaging (O(n^3) – O(n^4))
  • Need unified programming model for CPUs and GPUs
  • OpenCL is most flexible (across all CPU and GPUs) – good candidate for unified programming language.
  • Functional Portability: take OpenCL application and run it on multiple hardware platforms and expect it to produce correct results.
  • Performance Portability: functional portability + delivers performance close to entitlement performance (10-20%)
  • Partial Portability: functional portability + only host code tuning is required.
  • Benefits of OpenCL:
    • C like language – low learning curve
    • easy abstraction of host code (developers focus on kernel only)
    • easy platform abstraction (don’t need to decide platform right away.)
    • development resource versatility (suitable for multiple platforms)
  • GE uses a combination of buffers (image buffers and their own customized ones). Image buffers allow them to use a unique part of the GPU.
  • Awesome chart that compares various programming models:

Image courtesy of Intel SDK for OpenCL Webinar.

Intel OpenCL SDK: interoperable with Intel Media SDK with no copy overhead on Intel HD Graphics.

Intel Media SDK: hardware accelerated video encode/decode and predefined set of pre-processing filters

Thank you UC Berkeley Visual Computing Center for letting me know about this webinar series!