Monday, December 14, 2015

LESS command search fails with super-long lines


Symptom -- in a file with super-long lines, the search misses the target word; only if you first navigate to the correct region of the big text file does the search hit the target word.


Solution -- use grep to confirm the word is actually in the file.

Friday, December 11, 2015

C++ one source file -> one object file

Not sure if you can compile multiple source files into a single *.obj file...

http://stackoverflow.com/questions/6264249/how-does-the-compilation-linking-process-work:

Compilation refers to the processing of source code files (.c, .cc, or .cpp) and the creation of an 'object' file. This step doesn't create anything the user can actually run!

Instead, the compiler merely produces the machine language instructions that correspond to the source code file that was compiled. For instance, if you compile (but don't link) three separate files, you will have three object files created as output, each with the name .o or .obj (the extension will depend on your compiler). Each of these files contains a translation of your source code file into a machine language file -- but you can't run them yet! You need to turn them into executables your operating system can use. That's where the linker comes in.
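To make this concrete, here's a minimal two-file sketch (my own example; the exact commands depend on your compiler):

// add.cpp -- compiled on its own into an object file, e.g. "g++ -c add.cpp" producing add.o
int add(int x, int y) { return x + y; }

// main.cpp -- compiled separately ("g++ -c main.cpp" producing main.o), then linked
// with add.o into an executable: "g++ add.o main.o -o app"
int add(int x, int y);   // declaration only; the definition lives in the other object file
int main() { return add(1, 2); }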

0 probability ^ 0 density, again

See also the earlier post on 0 probability vs 0 density.

[[GregLawler]] P42 points out that for any continuous RV such as Z ~ N(0,1), Pr (Z = 1) = 0 i.e. zero point-probability mass. However the sum of many points Pr ( |Z| < 1 ) is not zero. It's around 68%. This is counterintuitive since we come from a background of discrete, rather than continuous, RV.

For a continuous RV, probability density is the more useful device than probability of an event. My imprecise definition is

prob_density at point (x=1) := Pr(X falling around 1 in a narrow strip of width dx)/dx, as dx shrinks towards 0

Intuitively and graphically, the strip's area gives the probability mass.

The sum of probabilities means integration, because we always add up the strips.
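In symbols (my own restatement, in the usual textbook notation):

\Pr(x < X \le x+dx) \approx f(x)\,dx, \qquad \Pr(a < X < b) = \int_a^b f(x)\,dx

For the standard normal Z, \Pr(|Z| < 1) = \int_{-1}^{1} \phi(z)\,dz \approx 0.68.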

Q: So what's the meaning of zero density vs zero probability? This is tricky and important.

For a discrete RV, zero probability always means an impossible outcome, but for a continuous RV, zero probability could mean either
A) zero density i.e. impossible outcome, or
B) positive density but a strip width of 0

Eg: if I randomly select a tree in a park, Pr(height > 9KM) = 0 -- Case A. For Case B, Pr(height = exactly 2M) = 0, even though such a height is clearly possible.

0 probability ^ 0 density, briefly

(label:clarified, math_stat)

See the other post on 0 probability vs 0 density.

Eg: Suppose I ask you to cut a 5-meter-long string by throwing a knife. What's the distribution of the longer piece's length? There is a pdf f(x). Bell-shaped, since most people will aim at the center.

This f(x) = 0 for x > 5 i.e. zero density.

For the same reason, Pr(X > 5) = 0 i.e. no uncertainty, 100% guaranteed.

Here's my idea of probability density at x=4.98. If a computer simulates 100 billion trials, will there be some hits within the neighbourhood around x=4.98 ? Very small but positive density. In contrast, the chance of hitting x=5.1 is zero no matter how many times we try.

By the way, due to the definition of density function, f(4.98) > 0 but Pr(X=4.98) = 0, because the range around 4.98 has zero width.

Thursday, December 10, 2015

Applying Ito's formula on math problems -- learning notes

Ito's formula in a nutshell -- Given the dynamics of a process X, we can derive the dynamics[1] of a function[2] f() of X.

[1] The original "dynamics" is usually in a stoch-integral form like

  dX = m(X,t) dt + s(X,t) dB

In some problems, X is given in exact form not integral form. For an important special case, X could be the BM process itself:

  Xt=Bt

[2] the "function" or the dependent random variable "f" is often presented in exact form, to let us find partials. However, in general, f() may not have a simple math form. Example: in my rice-cooker, the pressure is some unspecified but "tangible" function of the temperature. Ito's formula is usable if this function is twice differentiable.

The new dynamics we find is usually in stoch-integral form, but the right-hand-side usually involves X, dX, f or df.

Ideally, RHS should involve none of them and only dB, dt and constants. GBM is such an ideal case.

Tuesday, December 1, 2015

DOS FINDSTR/FIND -- like grep

1) FINDSTR? I use findstr more

2) FIND?
Note Search-string is case sensitive.

!Must double-quote search-string. (Single quote doesn't work.)

ftype | find "Python"
assoc | find "pyw"

Saturday, November 21, 2015

mv-semantic and RVR - Lesson #1

#9 std::move()
#8 STL containers using mv-semantic -- confusing if we don't have a firm grounding on ...

#7 mv ctor and mv assignment -- all the fine details would be poorly understood if we don't have a grip on ...

#5 RVR -- is a non-trivial feature by itself. First, we really need to compare...

#3 rval expression vs lval expression -- but the "expression" bit is confusing!

#1 lval expression vs lval variable vs objects in a program
This is a fundamental concept.

RVR - exclusively(?) used as function parameter


There could be tutorials showing other usages, such as a regular RVR variable, but I don't see any compelling reason. I only understand the motivation for
- move-ctor/move-assignment and
- insertions like push_back()

When there's a function taking a RVR param, there's usually a "traditional" overload without RVR, so compiler can choose based on the argument:
* if arg is an rval expression then choose the RVR version [1]
* otherwise, choose the traditional

[1] std::move(myVar) would cast myVar into an rval expression. This is kind of the converter between the 2 overloads.
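A minimal sketch of this overload pair (my own example, using std::string as the payload):

#include <iostream>
#include <string>
#include <utility>

void process(const std::string& s) { std::cout << "traditional (copy) overload\n"; }
void process(std::string&& s)      { std::cout << "RVR (move) overload\n"; }

int main() {
    std::string myVar = "hello";
    process(myVar);             // lval expression -> traditional overload
    process(std::string("x")); // rval expression (temp) -> RVR overload
    process(std::move(myVar)); // move() casts myVar to an rval expression -> RVR overload
}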

In some cases, there exists only the RVR version, perhaps because copy is prohibited... Perhaps a class holding a FILE pointer?

Tuesday, November 10, 2015

c# delegates - 2 fundamental categories

Update: java 8 lambda. [[mastering lambdas]] P 5 mentions AA BB DD as use cases for Command pattern

AA -- is the main topic of the book
BB -- https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html
DD -- http://www.oracle.com/technetwork/articles/java/lambda-1984522.html and http://www.drdobbs.com/jvm/jdk-getting-a-head-start-with-lambdas/240008174
--
Today I finally feel ready to summarize the 2 usage categories of c# delegates. If you are new like me, it's better to treat them as 2 unrelated constructs. Between them, the underlying technical similarities are relevant only during initial deep dive, and become less important as you see them in various patterns (or "designs").

In java, these features are probably achieved using interface + other constructs. Dotnet supports interfaces too, but in some contexts offers more "specialized" constructs in the form of delegates. As shown below, in such a context interfaces are usable but less convenient than delegates.

Most tutorials would start with the unicast delegate, without the invocation list.

Coming from a java or c++ background, you will want to find a counterpart or close equivalent to delegates. Almost hopeless for BB. For AA there are quite a few, which adds to the confusion.

AA) lambda as function AAArgument
* I believe compiler converts such a lambda into a STATIC method (there's really no host object) then bind it to a delegate Instance
* often returns something
* Usually _stateless_, usually without captured variables.
* predominantly unicast
* Action, Func and Predicate are often used in this context, but you can also use these 3 constructs as closures, which is a 3rd category beyond AA and BB
* Domain? Linq; functor argument as per Rajesh of OC
* Before java 8, often implemented as an interface or an anonymous inner class

BB) event field Backed by a multicast delegate instance (See other posts on "newsletter")
* Usually NON-STATIC methods, esp. in GUI context
* Usually returns void
* A callback method is usually _stateful_, because the callback method often does its job with "self-owned" i.e. this.someField objects, beyond the argument objects passed in
* event is a pseudo _f i e l d_ in a class. Static field or non-static field are both common.
* Domain? GUI + asynchronous WCF
* we must register the callback method with the event source.
** can lead to memory leak
* in java, implemented by several callback Objects supporting a common interface (cf swing)


Some of the other categories --
CC) CCClosure -- stateful, seldom passed in as a func arg
DD) thread start
EE) Passing in a delegate instance as a function input is the lambda usage (AA). How about returning from a function? P83 [[c#precisely]] shows examples to return an anon delegate instance. This is an advanced and non-trivial unicast use case. The underlying anon method is usually static and stateless

In a source code with AA, we see lots of custom delegate Types created on the fly.

In a source code with BB, delegate Instances take center stage, since each instance is assigned to a ... pseudo field. The delegate type is less complicated.

Thursday, November 5, 2015

op-new : no DCBC rule

B's op-new is bypassed by D's op-new [1]
B's ctor is always used (never bypassed) by D's ctor.

This is an interesting difference.

Similarly, an umbrella class's op-new [1] would not call a member object's op-new. See [[more effC++]]

These issues are real concerns if you want to use op-new to prohibit heap instantiation of your class.

See http://bigblog.tanbin.com/2012/01/dcbc-dtor-execution-order.html


[1] provided these two classes each define their own op-new.

By the way, op-new is a static member operator, but is still inherited.
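A minimal sketch (my own, with hypothetical B and D classes) showing both facts above:

#include <cstdio>
#include <new>

struct B {
    B() { std::puts("B ctor"); }                 // never bypassed by D's ctor
    void* operator new(std::size_t sz) {         // implicitly static
        std::puts("B::operator new");
        return ::operator new(sz);
    }
};

struct D : B {
    D() { std::puts("D ctor"); }
    void* operator new(std::size_t sz) {         // hides i.e. bypasses B's op-new
        std::puts("D::operator new");
        return ::operator new(sz);
    }
};

int main() {
    delete new D;   // prints: D::operator new, B ctor, D ctor
}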

Saturday, October 31, 2015

Fwd: RVR (rvalue reference) and mv-semantics - MSDN article

http://blogs.msdn.com/b/vcblog/archive/2009/02/03/rvalue-references-c-0x-features-in-vc10-part-2.aspx is one of the best articles on this confusing topic, written by someone in the Visual c++ core team.

One comment says "…this is mainly a feature for use in library-code. It's mostly transparent to client-code which will just silently benefit from it". How true!

I always believed this is a non-trivial topic. Many authors try to dumb it down and make it accessible to the mere mortals, but I feel a correct understanding takes a lot of effort.

Scott Meyers' articles are also in-depth but seem to skip the basics.

As I mentioned earlier, [[c++ primer]] has the best deep-intro on this confusing topic. Again, the author is another legend in the c++ community.

Tuesday, October 27, 2015

algo IV: binary matrix { raiserchu

Q: Given a 2D binary matrix filled with 0's and 1's, find the largest rectangle containing only ones and return its area. See raiserchu's mail on 12 Sep 13.

Analysis: Each rectangle is identified by 2 vertices, i.e. 4 integers. Without loss of generality, we require the "high" corner to have a higher x-coordinate and higher y-coordinate than the "low" corner. (We can assume the y-axis runs upward.) With this nested loop we can iterate over all possible rectangles in a given matrix:

Lock low corner
Move high corner in typewriter (zigzag) steps i.e.
  hold highY and move highX
  process the resulting rectangle
  increment highY and repeat
Move the lower corner in typewriter steps and repeat

Key observation: any "bad pixel" disqualifies every rectangle containing it.

--- My solution 2:
1) Save all bad pixels in SQL table Bad, 
indexed by x-coordinate and 
indexed by y-coordinate

Table can be in-memory like Gemfire. Many sorted maps (skiplist or RB tree) support range selection. Gelber interviewer showed me how to use a SQL table to solve algo problems.

2) Follow the nested loop to iterate over all possible rectangles, either disqualify it or save/update its area in maxFound. Here's how to disqualify efficiently:

For each rectangle under evaluation, we have 4 numbers (lowX, lowY) and (highX, highY).

select ANY from Bad where lowX < Bad.x < highX and lowY < Bad.y < highY

If any hit, the rectangle is disqualified. In fact, every rectangle whose high corner lies further along the same horizontal level is also disqualified, so in the nested loop we skip ahead and increment highY.

3) At end of nested loop, maxFound is the final answer.
--- my earlier solution 1:
1) Iterate over all possible rectangles and save them in a SQL table Rec, indexed by the 4 integers. No need to validate each (time-consuming). Next we start elimination
2) Iterate over all bad pixels. For each bad pixel found, delete from Rec where Rec.lowX < X < Rec.highX and Rec.lowY < Y < Rec.highY

Now all remaining rows are valid candidates
3) max ( (x2 - x1)*(y2 - y1) )
--- Here's my partial solution:
We can effectively ignore all the "good pixels".

1) Look at the x coordinates of all bad pixels. Sort them into an array. Find the largest gap. Suppose it's between x=22 and x=33. Our candidate rectangle extends horizontally from 23 to 32, exactly. Notice there's no bad pixel within this vertical band [1].
2) Look at the y coordinates of all bad pixels. Sort them into an array. Find the largest gap. Suppose it's between y=15 and y=18. Our candidate rectangle extends vertically from 16 to 17, exactly.
[1] This candidate rectangle can also expand all the way vertically (the band contains no bad pixels), which may give a bigger rectangle.
Ditto horizontally.

EPI300 skyline problem

PROBLEM STATEMENT
We have to design a program which helps drawing the skyline of a two-dimensional city given the locations of the rectangular buildings in the city. All of the buildings are built on a flat ground (that is they share a common bottom) and each building Bi is represented by a triplet of (li, ri, hi) where li and ri are the left and right coordinates, respectively, of the ith building, and hi is the height of the building. In the diagram below there are 8 buildings, represented from left to right by the triplets (1, 5, 11), (2, 7, 6), (3, 9, 13), (12, 16, 7), (14, 25, 3), (19, 22, 18), (23, 29, 13) and (24, 28, 4).


Input
The input of the program is a sequence of building triplets. The triplets are sorted by li (the left x-coordinate of the building) in ascending order.

Q1: For any given point X on the x-axis, output the height on the skyline
A1(brute force): Select max(h) from Table where l < x < r. Use an in-memory database.

Q2: draw the skyline. (I think this can benefit from Q1).
A2: Sort all the li and ri individually (8x2 values in our eg.) Each is an Edge struct {
  A pointer to the building struct;
  bool isLeft, isRight
}
These sorted edges form the array "theEdges".

During the scan, we will maintain a sorted list (skiplist or RB-tree) of "alive" buildings, sorted by height.

Iterate through theEdges,
* If we encounter a left edge, we insert the associated building into "alive". If its height is higher than the currentHeight, then update currentHeight (initialized to 0).
* If we encounter a right edge, we remove the building from "alive". If its height == currentHeight, then we update currentHeight by recomputing the highest among the remaining "alive" buildings, which could be 0, unchanged, or a lower height.
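Below is a rough C++ sketch of A2 (my own, not from EPI; it ignores tie-breaking when a left edge and a right edge share the same x):

#include <algorithm>
#include <iostream>
#include <set>
#include <vector>

struct Building { int l, r, h; };

int main() {
    std::vector<Building> bldgs = {{1,5,11},{2,7,6},{3,9,13},{12,16,7},
                                   {14,25,3},{19,22,18},{23,29,13},{24,28,4}};
    struct Edge { int x; bool isLeft; const Building* b; };
    std::vector<Edge> theEdges;
    for (const auto& b : bldgs) {
        theEdges.push_back({b.l, true,  &b});
        theEdges.push_back({b.r, false, &b});
    }
    std::sort(theEdges.begin(), theEdges.end(),
              [](const Edge& e1, const Edge& e2){ return e1.x < e2.x; });

    std::multiset<int> alive;            // heights of currently "alive" buildings
    int currentHeight = 0;
    for (const auto& e : theEdges) {
        if (e.isLeft) alive.insert(e.b->h);
        else          alive.erase(alive.find(e.b->h));
        int h = alive.empty() ? 0 : *alive.rbegin();   // recompute the highest
        if (h != currentHeight) {                       // skyline changes at this x
            currentHeight = h;
            std::cout << "(" << e.x << ", " << currentHeight << ")\n";
        }
    }
}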

Monday, October 26, 2015

coding IV - favored by the smartest companies

XR,

1a) fundamental -- Some Wall St (also in Asia) tough interviews like deep, low-level (not obscure) topics like threading, memory, vtbl, RB-tree ..

1b) lang -- Some (IMO mediocre) interviewers (like programming language lawyers) focus on language details unrelated to 1a)

2) Some (IMO mediocre) interviewers like architecture or high level design questions (excluding algo designs) like OO rules and patterns but I feel these are not so tough.

3) algo -- Some tough interviews focus on algo problem solving in pseudo-code. See http://bigblog.tanbin.com/2007/09/google-interviews-apply-comp-science.html. I got these at Google and Facebook. Some quants get these too.

0) Coding question is another tough interview type, be it take-home, onsite or webex. 

With some exceptions (like easy coding questions), each of these skills is somewhat "ivory tower" i.e. removed from everyday reality, often unneeded in commercial projects. However these skills (including coding) are heavily tested by the smartest employers, the most respected companies including Google, Facebook, Microsoft... You and I may feel like the little boy in "emperor's new dress", but these smart guys can't all be wrong. 

I will again predict that coding questions will grow more popular as the logistical cost is driven down progressively.

Candidate screening is tough, time consuming and, to some extent, hit-and-miss. With all of these screening strategies, the new hire can still fail on TECHNICAL grounds. Perhaps she/he lacks some practical skills -- code reading; debugging using logs; automation scripts; tool knowledge (like version control, text search/replace tools, build tools, and many linux commands)

Needless to say, new hires more often fail on non-technical grounds like communication and attitude -- another topic altogether.

In terms of real difficulty, the toughest comp science problems revolve around algorithms and data structures, often without optimal solutions. Algo interview questions are the mainstay at big tech firms, but not on Wall St. Some say "practice 6 months". Second toughest would be coding questions --
* Often there's too little time given
* Sometimes the interviewer doesn't like our style, though onsite coding tends to focus on substance rather than style.

Tan bin

P.S. I had webex style coding interviews with 1) ICE Feb 2011, 2) Barclays (swing), 3) Thomson Reuters, 4) Bloomberg

P.S. I tend to have more success with algo interviews and onsite coding than take-home coding. See blog posts.. (
http://tigertanbin2.blogspot.com/2015/05/sticky-weakness-revealed-by-interviews-c.html
http://tigertanbinpripri.blogspot.com/2015/03/high-end-developer-interviews-tend-to.html
)

Saturday, October 24, 2015

mv-semantics | the players

Q: resource means?
A: Some expensive data item to acquire. It requires allocating from some resource manager such as a memory allocator. Each class instance holds its own resource, not sharable.

Q: What's the thing that's moved? not the class instance, but part of the class instance. A "resource" used by the instance, via a pointer field.

Q: Who moves it? The compiler, not the runtime. The compiler selects the mv-ctor or mv-assignment, only if all conditions are satisfied. See post on use-case.

Q: What's the mv-ctor (or mv-assignment) for? The enabler. It alone isn't enough to make the move happen.

Q: What does the RVR point to? It points to an object that the programmer has marked for imminent destruction.
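To tie the players together, here is a bare-bones class (my own sketch) with a pointer field owning the resource:

#include <cstring>
#include <utility>

class Holder {
    char* buf;                                         // pointer field to the resource
public:
    explicit Holder(const char* s) : buf(new char[std::strlen(s) + 1]) {
        std::strcpy(buf, s);                           // expensive acquisition
    }
    Holder(Holder&& other) noexcept : buf(other.buf) { // mv-ctor steals the resource
        other.buf = nullptr;                           // cripple the source object
    }
    ~Holder() { delete[] buf; }
    Holder(const Holder&) = delete;                    // resource not sharable in this sketch
};

int main() {
    Holder a("some expensive payload");
    Holder b(std::move(a));   // a is marked for imminent destruction; only the pointer moves
}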

Saturday, September 26, 2015

[[c++recipes]] mv-semantic etc

I find this book rather practical. Many small programs are fully tested and demonstrated.

This 2015 book covers cpp14.

--#1) RVR(rval ref) and move semantic:
This book offers just enough detail (over 5-10 pages) to show how move ctor reduces waste. Example class has a large non-ref field.

P49 shows move(), but P48 shows even without a move() call the compiler is able to *select* the move ctor not copy-ctor when passing an instance into a non-ref parameter. The copy ctor is present but skipped!

P49 shows an effective mv-ctor can be "=default; "

--custom new/delete to trace memory operations
Sample code showing how delete() can report where in the source code the corresponding new() happened. It demonstrates a common technique -- allocating an additional custom memory header with each allocation.

This is more practical than the [[effC++]] recipe.

There's also a version for array-new. The class-specific-new doesn't need the memory header.

--other
A simple example code of weak_ptr.

a custom small block allocator to reduce memory fragmentation

Using promise/future to transfer data between a worker thread and a boss thread

Friday, September 25, 2015

mv-semantic | && param OR "move()" unneeded?

The rules in the standard and in the implementations are not straightforward. Let's try to have some simple guidelines:

The && parameter (in the function) -- So far, the evidence is inconclusive so I will assume we always need an RVR parameter.

The call to move() -- like on P47 [[c++ recipes]], sometimes without move() the move still happens, but I don't understand the rules so I will assume I should always call move() explicitly.


mv-semantic | keywords

I feel all the tutorials seem to miss some important details and sell a kind of propaganda. Maybe [[c++ recipes]] is better?

[s = I believe std::string is a good illustration of this keyword]

[2] readonly - Without mv-semantic, we need to treat the original instance as readonly.

[2] allocation - mv-semantic efficiently avoids memory allocation on heap or on stack

[2] pointer field - every tutorial shows a class with a pointer field. Note a reference field is much less common.

[2] deep-clone - is traditional. Mv-semantics uses some special form of shallow-copy. Has to be carefully managed.

[s] temp - the RHS of mv-semantic must strictly be a temp object. I believe by using the move() function and the r-val reference (RVR) we promise to the compiler not to access the temp object afterwards. If we access it, I guess bad things could happen. Similar to UndefBehv? See [[c++standard library]]

promise - see above

containers - All standard STL container classes[1] provide mv-semantics. Here, the entire container instance is the payload! Inserting a float into a container won't need mv-semantics.

[1] including std::string

expensive -- allocation and copying assumed expensive. If not expensive, then the move is not worthwhile.

cripple -- the source object of the move is crippled and should not be used afterwards. Its resource is already stolen, so the pointer field should be set to NULL.

--------
http://www.boost.org/doc/libs/1_59_0/doc/html/move/implementing_movable_classes.html says
Many aspects of move semantics can be emulated for compilers not supporting rvalue references and Boost.Move offers tools for that purpose

I think this sheds light...

mv-semantic | use cases rather few

I think the use case for mv-constructs is tricky. In many simple contexts mv-constructs actually don't work.

Justification for introducing mv-semantic is clearest in one scenario -- a short-lived but complex stack object is passed by value into a function. The argument object is a temp copy -- unnecessary.

Note the data type should be a complex type like containers (including string), not an int. In fact, as explained in the post on "keywords", there's usually a pointer field and allocation.

Other use cases are slightly more complex, and the justification is weaker.

Q: [[c++standard library]] P21 says ANY nontrivial class should provide a mv ctor AND a mv-assignment. Why? (We assume there's pointer field and allocation involved if no mv-semantics.)
%%A: To avoid making temp copies when inserting into container. I think vector relocation also benefits from mv-ctor

[[c++forTheImpatient]] P640 shows that sorting a vector of strings can benefit from mv-semantics. Swapping 2 elements in the vector requires a pointer swap rather than copying strings.

mv-semantic | returning RVR

[[the c++Standard library]] P23 explains the rules about returning rval ref. It's obscure and hard to remember. My rule of thumb is to avoid returning RVR. I don't see any good use case.

Sunday, September 20, 2015

lambda meets template

In cpp, java and c#, the worst part of lambda is the integration with (parametrized) templates.

In each case, we need to understand the base technology and how it integrates with templates, otherwise you will be lost. The base technologies are (see post on "lambda - replicable")
- delegate
- anon nested class
- functor

Syntax is bad but not the worst. Don't get bogged down there.
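A tiny C++ example of that integration (my own; the lambda's unnamed functor type is deduced by the std::sort template):

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v = {5, 2, 9, 1};
    std::sort(v.begin(), v.end(),
              [](int a, int b) { return a > b; });   // compiler generates an unnamed functor type
    for (int x : v) std::cout << x << ' ';            // prints 9 5 2 1
}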

lambda is more industry-standard than delegate

Before java and c++ introduced lambda, I thought delegate was the foundation of lambdas.

Now I think lambda is an industry standard, implemented differently in c++ and java. See post on "lambda - replicable". For python...

Bear in mind
A) the most fundamental, and pure definition of lambda -- a function as rvalue, to be passed in as argument to other functions.

B1) the most common usage is sequence processing in c#, java and c++
* c# introduced lambda along with linq
* java introduced lambda along with streams

B2) 2nd common usage is event handler including GUI.

See post on "2 fundamental categories"

noSQL 2 categories - KV ^ json/xml document store

Xml and json both support hierarchical data, but as a data type, each document is the payload.

This is the 2nd category of noSQL system. #1 category is the key-value store, the most common category.

coherence/gemfire/gigaspace - KV
terracotta - KV
memcached - KV
oracle NoSQL - KV
Redis - KV
Table service (name?) in Windows Azure - KV

CouchDB - document store (json)
mongo - document store (json)

big data feature - variability in value (4th V)

RDBMS - every row is considered "high value". In contrast, many data items in a big data store are considered low-value.

The oracle nosql book refers to it as "variability of value". The authors clearly think this is a major feature, a 4th "V" beside Volume, Velocity and Variety of data format.

Data loss is often tolerable in big data, never acceptable in RDBMS.

Exceptions, IMHO:
* columnar DB
* Quartz, SecDB

Saturday, September 19, 2015

big data key feature - inexpensive hardware

See post on variability

Economics -- data volume often necessitates inexpensive storage. Commodity hardware is a key feature of big data.

"Inexpensive" helps scale-out (aka horizontal scaling). Just add more nodes. In contrast, RDBMS requires scale-up to bigger machines. See other posts on scale-out.

big data feature - scale out

Scalability is driven by one of the 4 V's -- Velocity, aka throughput.

Disambiguation: having many machines to store the data as readonly isn't "scalability". Any non-scalable solution could achieve that without effort.

Big data often requires higher throughput than RDBMS could support. The solution is horizontal rather than vertical scalability.

I guess gmail is one example; it requires massive horizontal scalability. I believe RDBMS also has similar features such as partitioning, but I'm not sure it is economical. See posts on "inexpensive hardware".

The Oracle nosql book suggests noSQL compared to RDBMS, is more scalable --- 10 times or more.

noSQL feature - ACID

See post on "variability", the 4th V of big data.

A noSQL software could support transactions as RDBMS does, but the feature support is minimal in noSQL, according to the Oracle noSQL book.

Transactions slow down throughput, esp. write-throughput.

In a big data site, not all data items are high value, so ACID properties may not be worthwhile.


noSQL feature #1 - unstructured

I feel this is the #1 feature. RDBMS data is very structured. Some call it rigid.
- Column types
- unique constraints
- non-null constraints
- foreign keys...
- ...

In theory a noSQL data store could have the same structure, but usually it doesn't. I believe the noSQL software doesn't have such a rich and complete feature set as an RDBMS.

I believe real noSQL sites usually deal with unstructured data. "Free form" is my word.

Rigidity means harder to change the "structure". Longer time to market. Less nimble.

What about BLOB/CLOB? Supported in RDBMS but more like an afterthought. There are specialized data stores for them. Some noSQL software may qualify.

Personally, I feel RDBMS (like unix, http, TCP/IP...) prove to be flexible, adaptable and resilient over the years. So I would often choose RDBMS when others prefer a noSQL solution.

Tuesday, September 15, 2015

friends' comments on slow coder and deadlines

Wall Street systems are stringent about timelines, less about reliability, even less about maintainability, total cost of ownership, or testing. I feel very few (like 5%) Wall St systems are high precision, including the pricing, risk and trade execution systems. Numerical accuracy is important to them though...

In Citi, Boris's code was thrown out because it didn't make it to production. Any production code is controlled not by the dev team but by many, many layers of control measures. So my prod code in Citi will live.

If you are slow, Anthony Lin feels they may remove you and get a replacement to hit the deadline. If they feel it's hard to find a replacement and train him up, then they keep you – all about timelines.

Fiona Hou felt your effort does protect you – 8.30 to 6pm every day. If you are still slow, then the manager may agree the estimate was wrong. However, if you are obviously slower than peers, then the boss knows it.

Friday, September 11, 2015

Quartz ^ java job in hindsight

When taking up Quartz, I decided to avoid the diminishing returns of java. I believed (and still believe) the initial learning in any technology is fastest. But was the enrichment worthwhile? Now I think yes and no.
- yes more java probably wouldn't have improved my technical capabilities.
- no I didn't become a "grown-up" python veteran or deepen my domain knowledge in curve fitting, real time risk, risk attribution, risk bucketing..

(Below is adapted from a letter to Hu)

Most of my developer friends (mostly senior developers) in Goldman and Baml shun the in-house frameworks (SecDB or Quartz). Key reason cited is skill mobility and market value. There are other reasons unmentioned. Indeed, we have to spend many late hours learning the idiosyncrasies of the framework, but that value...

(Quartz position is predominantly programming) Someone said if you stay outside mainstream programming technology like java/c#/c++ for a few years (like 3Y) then you will forget most of the details and have a hard time on the job market -- 1) getting shortlisted or 2) passing the tech screening.

I thought I could learn some serious python programming but left disappointed. I didn't use a lot of python constructs, python hacks, python best practices, python modules (there were so many). I don't have a good explanation -- I feel most of the technical issues we faced were Quartz-specific issues like DAG, Sandra .... For an analogy, I can give you a 3Y job to develop c++ instrument drivers but restrict  you to my framework "CVI" so you are shielded from basic pointers, smart pointers, STL, threading, virtual functions, Boost, the big 3 (copy ctor, assignment operator, dtor) or any of the c++11 features. Over 3 years you would learn close to nothing about c++, though you would learn a great deal about CVI (C for Virtual Instrumentation, used in my early jobs.) So over 9 months I didn't go up a rewarding learning curve in Python and emerge a "grown-up" developer in python.

Besides, I hoped to learn some financial analytics, pricing, risk, stress test, back test, linear algebra, matrix operation, curve fitting, etc. Not much. Would a longer stay help? Maybe, but I doubt it.

It's true that many Quartz projects give you more direct exposure to "domain knowledge", since most of the low-level nuts and bolts are taken care of by the Quartz framework. In my projects we worked directly with trades, live rates, accounts, PnL, PV, risk sensitivity numbers, curves, risk buckets, carry costs, and (a large variety of) real securities,  ... 

However, there was no chance to work on the quant library as those modules belong to other teams in a well-organized dev organization like Quartz.

Resolving technical problems is always frustrating (until you have 3 years experience in Quartz) because the knowledge is kept in people's heads, so you can't google it. The documentation appears extensive, but actually very very limited compared to the technical resources on Python or any popular language.

Quartz has good dev practices like automated build/release, test coverage, code reviews, so you might benefit by learning these best practices.

Database technology is an essential skill. If you like object-oriented DB, then Quartz gives you the exposure, though the valuable skill of developing an OODB is reserved by the early pioneers of Quartz. In my opinion, a relational DB (RDBMS) is more widely used than an OODB, so it's more valuable to learn how to write complex SQL queries and tune them.

Quartz offers elaborate event subscription, so changes (on market prices or positions or user input or configuration) propagate throughout the OODB fabric in real time. I think this is one of the sexy (and powerful, valuable, enabling) features so I was excited initially. But it's too complex and too big. I used this feature extensively but didn't learn enough to benefit. Analogy -- you can learn many Korean words but still not enough to converse, let alone use the language effectively to communicate.


Thursday, September 10, 2015

equivalent FX trades, succinctly

The equivalence among FX trades can be confusing to some. I feel there are only 2 common scenarios:

1) Buying usdjpy is equivalent to selling jpyusd.
2) Buying usdjpy call is equivalent to buying jpyusd put.

However, buying a fx option is never equivalent to selling an fx option. The seller wants (implied) vol to drop, whereas the buyer wants it to increase.

Saturday, September 5, 2015

left skew~left side outliers~mean pulled left

Label - math intuitive

[[FRM]] book has the most intuitive explanation for me - negative (or left) skew means outliers in the left region.

Now, intuitively, moving outliers further out won't affect median at all, but pulls mean (i.e. the balance point) to the left. Therefore, compared to a symmetrical distribution, mean is now on the LEFT of median. With bad outliers, mean is pulled far to the left.

Intuitively, remember mean point is the point to balance the probability "mass".

In finance, if we look at the signed returns we tend to find many negative outliers (far more than positive outliers). Therefore the distribution of returns shows a left skew.
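For the record, the usual definition (standard formula, not from [[FRM]]):

\text{skewness} = \frac{E\big[(X-\mu)^3\big]}{\sigma^3}

Left-tail outliers make the cubed deviations large and negative, so the skewness comes out negative.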

Friday, August 21, 2015

I feel productive with coding when ...

... when I google and find solutions or "machine-verifiable" insights
... when I spend time hacking code at home
... whenever I hook up a debugger
... when I managed to gain "control" over an opaque/complex system

! This is in stark contrast to textbook theories.


I feel the thrill when I get the code to work. In terms of my long-term career and long term financial stability, I feel this interest and satisfaction is arguably more important than my aptitude.

Open source systems are more rewarding than Microsoft technology. I would say scripting tools are esp. rewarding.


Monday, August 10, 2015

switching between LESS, vi, TAIL ...

less -> vi : "v"

tail -> less? Nothing found.

--from within vi,
switch to readonly mode: qq( :se ro )
switch to readwrite mode: qq( :se noro )

---less to replace tail -f
$ less /var/log/messages

Common-mode -> Follow-mode: F
Follow-mode -> Common-mode: Ctrl-c

To start less in the Follow-mode,
$ less +F /var/log/messages

Sunday, August 9, 2015

feeling competent about linux, java, perl, .. not cpp/c#

Nowadays I feel more competent developing in linux than windows. If you look hard you realize this is relative to team competence. If my team members are familiar with linux but struggling in windows, then I would feel very differently! Same thing could happen between java and Quartz -- in a java team, my Quartz know-how may be quite visible and hard to emulate.

Perhaps the defining example is Perl. In so many teams no one knows more about perl than me. When perl was important to the team, I felt very competent and strong, almost effortless. In GS, there were several perl old-hands, but I still felt on par. Key reason - not many people specialize in perl, and none of the OO stuff was important in those projects.

I gradually grew my competence in java. My theoretical knowledge grew fast, but competence is not really about IV questions. More about GSD know-how where instrumentation (including tool-chain) /knowledge/ is probably #1.

Big lesson -- get a job to learn portable skills. You don't want to do years of python programming but only in some proprietary framework.

Within finance, technology outlives most job functions

Look at these job functions --

* Many analysts in finance need to learn data analytics software ....
* Risk managers depend on large risk systems...
* Quants need non-trivial coding skill...
* Everyone in finance needs Excel, databases, and ... financial data.
.... while the IT department faces no threat, except outsourcing. Why?

Surely ... Financial data volume is growing
Surely ... Automation reduces human error, enforces control -- operational risk...
Computer capabilities are improving
Financial data quality is improving
Computers are good at data processing, esp. repetitive, multi-step...
Financial info tech is important and valuable (no need to explain), not simple, requires talent, training, experience and a big team. Not really blue-collar.

Many techies point out the organizational inefficiencies and suggest there's no need for so many techies, but comparatively is there a need for so many analysts, or so many risk managers, or so many accountants or so many traders? Every role is dispensable! Global population is growing and getting better educated, so educated workforce must work.

Saturday, July 11, 2015

keyboard accelerator to open c:/temp/a

Easy in win-xp -- create an item on the start menu. The item is a link to the tmp file.

--solution 1:
On win7, I had to name the file something like c:/temp/tmppp.txt. Then place a shortcut on Desktop outside any myShortcuts\ folder. Then right-click -> properties -> shortcut.

Ineffective when notepad++ is already running (possibly minimized to tray...), so the workaround is Alt-F4 to kill notepad++ then use the shortcut key.

--Solution 2: drag shortcut "pin to start menu"

1. Open mswe and locate the file
2. Drag the file towards the start button until "pin to start menu"
3. Pin it to top of the list
4. From now on, we can wake up the start menu then arrow-down, blindfold

Saturday, June 27, 2015

dll late binding via LoadLibrary()

[[beginning vc++2010]] P1179 -- a host app can run for a while before loading a dll. I guess in the interim a user can replace the dll.

benefit -- smaller app footprint. All those large DLLs load into memory only when needed, on demand.
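A minimal sketch of the late-binding calls (the dll name "mymath.dll" and the exported function "add" are hypothetical):

#include <windows.h>
#include <iostream>

int main() {
    HMODULE h = LoadLibraryA("mymath.dll");        // loaded on demand, not at app startup
    if (!h) { std::cerr << "dll not found\n"; return 1; }

    typedef int (*AddFn)(int, int);
    AddFn add = reinterpret_cast<AddFn>(GetProcAddress(h, "add"));
    if (add) std::cout << add(2, 3) << '\n';

    FreeLibrary(h);
    return 0;
}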

which programming languages changed toooo much@@

No elaboration please. Pointers will do.

perl 6?
python 3?  but the excessive changes are not taken up!
php5 was better received.
C# ?

MFC -> winforms -> WPF ? Different technologies. More like delphi/powerBuilder/coldFusion declining than a language evolving.

java 8? not much change. Lambda is the biggest
c++11? not much change. Rval ref is the biggest


java 8 default method breaking backward compatibility

--Based on P171 [[Mastering Lambdas]]

Backward compatibility (BWC) means that when an existing interface like Collection.java includes a brand new default method, the existing "customer" source code should work as before.

Default methods have a few known violations of BWC.

* simplest case: all (default) methods in an interface must be public. No buts or ifs.  Suppose MyConcreteClass has private m2() and implements MyIntf. What if MyIntf now gets a default method m2()? Compilation error!

* a more serious case: the java overriding rule (similar to c++) is very strict (http://bigblog.tanbin.com/2011/02/runtime-binding-is-highly-restrictive.html), so m(int) and m(long) are always, automatically, overloading not overriding. Consider myObj.m(33). Originally, this binds to the m(long) declared in the class. Suppose the new default method is m(int). It overloads, and being a better match it is selected by the compiler (not the JVM runtime). Breaks backward compatibility.

This excellent thin book gives 2 more cases...


overvalued ^ undervalued in tech IV - random picks



--overvalued – 10 items unranked:
problem solving with comp science constructs -- mostly algos
fast coding
code quality – in those take-home coding tests
corner cases
arch
OO design theories, principles

--undervalued practical skills:
stackoverflow type of know-how, including syntax details. These aren't quizzed in high-end interviews.
tools including IDEs
[T] bug hunting
[T] large code base reading
[T] reading lots of documentation of various quality to figure out why things not working

[T = hard to test in IV]



oldest programmer in a team, briefly

I'm unafraid of being the oldest programmer in a team, for example in China or Singapore, as long as I'm competent. If my foundation is strong and I know the local system well, then I will be up to the challenge.

Actually, it can be fun and stimulating to work with young programmers. It keeps me young. My experience might add unique value, if it's relevant.

I can think of many Wall St consultants in this category.

linker error in java, briefly

[[mastering lambdas]] points out one important scenario of java linker error. It can happen in java 1.4 or earlier. Here's my recollection.

Say someone adds a method m1() to interface Collection.java. This new compiled code can coexist with lots of existing compiled code, but there's a hidden defect. Say someone else writes a consumer class using Collection.java, and calls m1() on it. This would compile in a project having the new Collection.java but without recompiling HashSet.java. Again, this looks fine on the surface. At run time, there must be a concrete class when m1() runs. Suppose it's a HashSet compiled long ago. This hits a linker error, since that HashSet doesn't implement m1().


Tuesday, June 23, 2015

## once-valuable skills - personal experience

struts
perl - lost to python
tomcat, weblogic, jboss
apache, mysql
dns, unix network config
autosys - not used after GS
[$] sybase, oracle - after ML edge project I didn't need it again.
[$] VBA - used once only. But Excel is going strong!

[$ = still has value]

--random list (not "top 10") of longevity skills
eclipse, MSVS
Excel and add-in
STL, boost
javascript, though I don't use it in my recent jobs
python
Linux shell
vi
compiler skills
make, msbuild
bash and DOS batch scripting, even though powershell, vbscript and python/perl are much richer.

Sunday, June 21, 2015

BBG algo IV - locate a pair producing a desired sum


Q: Given an unsorted array of positive integers, is it possible to find a pair of integers from that array that sum up to a given sum?

Constraints: This should be done in O(n) and in-place without any external storage like arrays, hash-maps, but you can use extra variables/pointers.

If this is not possible, can there be a proof given for the same?
-----Initial analysis----
I wish I were allowed to use a hash table of "wanted" values. (Iterate once and build the hashtable. For each new value encountered, check if it is in the "wanted" list...)
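For what it's worth, a sketch of that (disallowed) hash-table idea in C++:

#include <unordered_set>
#include <vector>

// Uses external storage (a hash set of "wanted" complements), which the constraint forbids.
bool hasPairWithSum(const std::vector<int>& arr, int target) {
    std::unordered_set<int> wanted;
    for (int v : arr) {
        if (wanted.count(v)) return true;   // v completes some earlier element
        wanted.insert(target - v);          // remember what would pair with v
    }
    return false;
}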

I feel this is typical of west coast algorithm quiz.

I feel it's technically impossible, but proof?  I don't know the standard procedure to prove O(n) is impossible. Here's my feeble attempt:

Worst case -- the pair happens to be the 1st and last in the array. Without external storage, how do we remember the 1st element?  We can only use a small number of variables, even if the array size is 99999999. As we iterate the array, I guess there would be no clue  that 1st element is worth remembering. Obviously if we forget the 1st element, then when we see the last element we won't recognize they are the pair.
-----2nd analysis----
If we can find an O(n) sort then the problem is solvable. Let's look at Radix sort, one of the most promising candidates.

Assumption 1: the integers all have a maximum "size" in terms of digits. Let's say 32-bit. Then yes, radix sort is O(n). Now, with any big-O analysis we impose no limit on the sample size. For example we could have 999888777666555444333 integers. Now, 32-bit gives about 4 billion distinct "pigeon-holes", so by the pigeon-hole principle most integers in our sample have to be duplicates! Sounds artificial and unreasonable.

Therefore, Assumption 1 is questionable. In fact, some programming languages impose no limit on integer size. One integer, be it 32 thousand bits or 32 billion bits, could use up as much memory as there is in the system. Therefore, Assumption 1 is actually superfluous.

Without Assumption 1, and if we allow our sample to be freely distributed, we must assume nothing about the maximum number of digits. I would simply use

Assumption 2: maximum number of digits is about log(n). In that case radix sort is O(n log(n)), not linear time:(

Saturday, June 20, 2015

BUY a (low) interest rate = Borrow at a lock-in rate

Q: What does "buying at 2% interest rate" mean?

It's good to get an intuitive and memorable short explanation.

Rule -- Buying a 2% interest rate means borrowing at 2%.

Rule -- there's always a repayment period.
Rule -- the 2% is a fixed rate not a floating rate. In a way, whenever you buy you buy with a fixed price. You could buy the "floating stream" .... but let's not digress.

Real, personal example -- I "bought" my first mortgage at 1.18% for the first year, locking in a low rate before it went up.

factors affecting bond sensitivity to IR

In this context, we are concerned with the current market value (eg a $9bn bond) and how this holding may devalue due to Rising interest rate for that particular maturity.

* lower (zero) coupon bonds are more sensitive. More of the cash flow occurs in the distant future, therefore subject to more discounting.

* longer bonds are more sensitive. More of the cashflow is "pushed" to the distant future.

* lower yield bonds are more sensitive. On the Price/yield curve, at the left side, the curve is steeper.

(I read the above on a slide by Investment Analytics.)
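All three bullets are really statements about duration. The standard first-order approximation (textbook formula, my addition):

\frac{\Delta P}{P} \approx -D_{mod}\,\Delta y, \qquad D_{mod} = \frac{1}{1+y}\cdot\frac{1}{P}\sum_i t_i\,\frac{CF_i}{(1+y)^{t_i}}

Zero coupons, longer maturities and lower yields all push the weighted-average time up, hence higher sensitivity.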

Note if we hold the bond to maturity, then the dollar value received on maturity is completely deterministic i.e. known in advance. There are 3 issues with this strategy:

1) if in the interim my bond's MV drops badly, then this asset offers poor liquidity. I won't have the flexibility to get contingency cash out of this asset.

1b) Let's ignore credit risk in the bond itself. If this is a huge position (like 2 trillion) in the portfolio of a big organization (even for a sovereign fund), an MV drop could threaten the organization's balance sheet, credit rating and borrowing cost. Put yourself in the shoes of a creditor. Fundamentally, the market and the creditors need to be assured that this borrower could safely liquidate part of this bond asset to meet contingent obligations.

Fundamental risk to the creditors - the borrower i.e. the bond holder could become insolvent before bond maturity, when the bond price recovers.


2) over a long horizon like 30Y, that fixed dollar amount may suffer unexpected higher inflation. I feel this issue tends to affect any long-horizon investment.

3) if in the near future interest rises sharply (hurting my MV), that means there are better ways to invest my $9bn.


Friday, June 19, 2015

c++ parametrized functor - more learning notes

Parametrized Functor (class template) is a standard, recommended construct in c++, with no counterpart in java. C# delegate is conceptually simpler but internally more complex IMO, and represents a big upgrade from the c++ functor. Better get some clarity with functors before comparing with delegates.

The #1 most common functor is the stateless functor (like a simple lambda). The 2nd common category is the (stateful) immutable functor. In all cases, the functor is designed for pass-by-value (not by ref or by pointer), cheap to copy, cheap to construct. I see many textbook code samples creating throwaway functor instances.

Example of the 2nd category - P127[[essentialC++]].

A) One mental block is using a simple functor Class as a template type param. This is different from java/c#

B) One mental block is a simple parametrized functor i.e. class template. C++ parametrized functor can take various type params, more "promiscuous" than c#/java.

C) A bigger mental block, combining A+B, is a functor parametrized with another functor Class as a template param. See P186[[EssentialC++]]. This is presented as a simple construct, with about 20 lines of code, but the more I look at it, the more strange it feels.
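Here is a small sketch (mine, not from the book) of cases A and B together -- a templated functor passed into a function template as a type param:

#include <iostream>

template <typename T>
struct LessThan {                       // case B: parametrized functor (class template)
    T threshold;
    explicit LessThan(T t) : threshold(t) {}
    bool operator()(const T& x) const { return x < threshold; }
};

template <typename Pred>
int countIf(const int* arr, int n, Pred pred) {   // case A: functor class as template type param
    int c = 0;
    for (int i = 0; i < n; ++i)
        if (pred(arr[i])) ++c;
    return c;
}

int main() {
    int a[] = {3, 7, 1, 9, 4};
    std::cout << countIf(a, 5, LessThan<int>(5)) << '\n';   // prints 3
}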

In java, we write a custom comparator class rather than a comparator class Template. We also have the simpler alternative of a Comparable interface, but that's not very relevant in this discussion. Java 8 lambda -- http://www.dreamsyssoft.com/java-8-lambda-tutorial/comparator-tutorial.php

lambda - replicable in c++/java/c#

I hope to find online articles to support each claim.

C# -- anon delegates. See [[c# in depth]]

java -- anon nested classes. See [[mastering lambdas]]

c++ -- functor class (template) or function pointer. See https://blog.feabhas.com/2014/03/demystifying-c-lambdas/ and http://www.cprogramming.com/c++11/c++11-lambda-closures.html

64-bit java -- my own mini Q&A

Q: will 32-bit java apps run in 64-bit JVM?
A: Write once, run anywhere. Sun was extremely careful during the creation of the first 64-bit Java port to ensure Java binary and API compatibility, so all existing 100% pure Java programs would continue running just as they do under a 32-bit VM. However, non-pure java, like JNI, will break.
Q: 32bit apps need recompilation?
A: Unlike pure java apps, all native binary code that was written for a 32-bit VM must be recompiled for use in a 64-bit VM. All currently supported operating systems do not allow the mixing of 32 and 64-bit binaries or libraries within a single process.

Q: The primary advantage of running Java in a 64-bit environment?
A: larger address space. This allows for a much larger Java heap size and an increased maximum number of Java Threads.

Q: complications?
A: Any JNI native code in the 32-bit SDK implementation that relied on the old sizes of these data types is likely to require updating.
%%A: if java calls another program, maybe that program will need to be 64-bit compatible. This answer is slightly relevant.

Q: how is 32/64 bit JDK's installed?
A: Solaris has both a 32 and 64-bit J2SE implementation contained within the same installation of Java, you can specify either version. If neither -d32 nor -d64 is specified, the default is to run in a 32-bit environment. All other platforms (Windows and Linux) contain separate 32 and 64-bit installation packages.

nant.build Rebuild ^ build

In nant.build,

A) if a compilation is Rebuild, then it could be slow for a large solution. This is similar to MSVS solution rebuild

B) If a compilation is Build, then it can make use of whatever you have already built in MSVS, but if one of the projects was cancelled half-way it may come up as corrupted and break the nant build. I guess you can clean that project and rerun nant.

Sunday, June 14, 2015

ESC[.... showing up in LESS command output

Use "less -R" to view the file..
See http://unix.stackexchange.com/questions/78729/man-displaying-groff-control-characters

Saturday, June 13, 2015

"screen" command cheatsheet

For each feature, let's verify it and give links to only good web pages

Even after the "parent shell" of the screen is killed, the screen is still alive and "attached"!

--[ Essential] How to tell if I'm in a screen
http://serverfault.com/questions/377221/how-do-i-know-im-running-inside-a-linux-screen-or-not

--[ Essential] reattach
http://www.gnu.org/software/screen/manual/screen.html#Invoking-Screen

--feature
keep a session alive after I log out. Almost like windows mstsc.exe

--feature - log the screen output
See http://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/
The log file is created in the current dir ...

--feature - Run a long running process without maintaining an active shell session
See http://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/ -- using top as a long command

--(untested) feature - monitor a screen for inactivity as a sign of job done.
See http://www.rackaid.com/blog/linux-screen-tutorial-and-how-to/

teach your kid - interest in finance subjects, not just salary

If we focus on the salary, interviews .. then the learning interest won't last. In addition, it's worthwhile to develop interest in the subjects....

Some subjects in finance are dirty, shallow, pointless, superfluous (same in technology ;-). Some subjects are about gambling. I won't elaborate further. Instead, identify those areas with depth and worth studying.

Financial math subjects have depth. However, some of those stochastic subjects feel too theoretical and ivory-tower, too removed from reality to be relevant. I feel the statistics subjects are more practical.

There are many books ...

change of .. numeraire^measure

Advice: When possible, I would work with CoN rather than CoM. I believe once we identify another numeraire (say asset B) is useful, we just know there exists an equivalent measure associated with B (say measure J), so we could proceed. How to derive that measure I don't remember. Maybe there's a relatively simple formula, but very abstract.

In one case, we only have CoM, no CoN -- when changing from physical measure to risk neutral measure. There is no obvious, intuitive numeraire associated with the physical measure!

----
CoN is more intuitive than CoM. Numeraire has a more tangible meaning than "measure".

I think even my grandma could understand 2 different numeraires and how to switch between them.  Feels like simple math.

CoM has rigorous math behind it. CoM is not just for finance. I guess CoM is the foundation and basis of CoN.

I feel we don't have to have a detailed, in-depth grasp of CoM to use it in CoN.
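The one formula worth memorizing (standard result, my addition): under the measure J associated with numeraire N, every tradable V satisfies

\frac{V_0}{N_0} = E^{J}\!\left[\frac{V_T}{N_T}\right]

i.e. prices quoted in units of the numeraire are martingales.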

beta definition in CAPM - confusion cleared

In CAPM, beta (of a stock like ibm) is defined in terms of
* cov(ibm excess return, mkt excess return), and
* variance of mkt excess return
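In formula form (the standard CAPM definition):

\beta_{ibm} = \frac{\operatorname{Cov}(R_{ibm} - r_f,\; R_{mkt} - r_f)}{\operatorname{Var}(R_{mkt} - r_f)}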

I was under the impression that variance is the measured "dispersion" among the recent 60 monthly returns over 5 years (or another period). Such a calculation would yield a beta value that's heavily influenced or "skewed" by the last 5Y's performance. Another observation window is likely to give a very different beta value. This beta is based on such unstable input data, but we will treat it as a constant and use it to predict the ratio of ibm's return over the index return! Suppose we are lucky so the last 12M gives beta=1.3, the last 5Y yields the same, and year 2000 also yields the same. We could still be unlucky in the next 12M and this beta fails completely to predict that ratio... Wrong thought!

One of the roots of the confusion is the 2 views of variance, esp. with time-series data.

A) the "statistical variance", or sample variance. Basically 60 consecutive observations over 5 years. If these 60 numbers come from drastically different periods, then the sample variance won't represent the population.

B) the "probability variance" or "theoretical variance", or population variance, assuming the population has a fixed variance. This is abstract. Suppose ibm stock price is influenced mostly by temperature (or another factor not influenced by human behavior), so the inherent variance in the "system" is time-invariant. Note the distribution of daily return can be completely non-normal -- Could be binomial, uniform etc, but variance should be fixed, or at least stable -- i feel population variance can change with time, but should be fairly stable during the observation window -- Slow-changing.

My interpretation of the beta definition is based on an unstable, fast-changing variance. In contrast, CAPM theory is based on a fixed or slow-moving population variance -- the probability context. Basically the IID assumption. CAPM assumes we could estimate the population variance from history and this variance value will be valid in the foreseeable future.

In practice, practitioners (usually?) use a historical sample to estimate the population variance/cov. This is basically the statistical context A).

Imagine the inherent population variance changes as frequently as stock price itself. It would be futile to even estimate the population variance. In most time-series contexts, most models assume some stability in the population variance.

physical measure is impractical

Update: Now I think physical probability is not observable and utterly unusable in the math including the numerical methods.  In contrast, RN probabilities can be derived from observed prices.

Therefore, now I feel physical measure is completely irrelevant to option math.
---
RN measure is the "first" practical measure for derivative pricing. Most theories/models are formulated in RN measure. T-Forward measure and stock numeraire are convenient when using these models...

Physical measure is an impractical measure for pricing. Physical measure is personal feeling, not related to any market prices. Physical measure is mentioned only for teaching purpose. There's no "market data" on physical measure.

Market prices reflect RN (not physical) probabilities.

Consider a cash-or-nothing bet that pays $100 iff team A wins a playoff. The bet is selling for $30, so the RN Pr(win) = 30%. I am an insider and I rig the game so the physical Pr() = 80%, and Meimei (my daughter) may feel it's 50-50, but these personal opinions are irrelevant for pricing any derivative.

Instead, we use the option price $30 to back out the RN probabilities. Namely, calibrate the pricing curves using liquid options, then use the RN probabilities to price less liquid derivatives.
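
A tiny sketch of that back-out step (my own toy numbers, ignoring discounting as above):

  # implied risk-neutral probability from a cash-or-nothing price
  payoff = 100.0   # paid iff team A wins
  price = 30.0     # observed market price of the bet
  df = 1.0         # assume zero interest to expiry, so the discount factor is 1

  rn_prob = price / (df * payoff)
  print(rn_prob)   # 0.30 -- independent of my 80% or Meimei's 50-50 personal view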

Professor Yuri is the first to point out (during my oral exam!) that option prices are the input, not the output to such pricing systems.


drift ^ growth rate - are imprecise

The drift rate "j" is defined for BM not GBM

                dAt = j dt + dW term

Now, for GBM,

                dXt = r Xt  dt + dW term

So the drift rate, by definition, is r*Xt -- state-dependent. Therefore it's confusing to say "same drift as the riskfree rate". Safer to say "same growth rate" or "same expected return".
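
A quick simulation (my own sketch, with arbitrary parameters) of that difference -- the BM has a constant drift j, while the GBM's drift r*Xt is state-dependent, which is why "growth rate r" is the safer phrase:

  import numpy as np

  rng = np.random.default_rng(1)
  n_paths, n_steps, T = 20000, 252, 1.0
  dt = T / n_steps
  j, r, sigma = 0.05, 0.05, 0.2

  A = np.zeros(n_paths)   # arithmetic BM: dA = j dt + sigma dW
  X = np.ones(n_paths)    # GBM:           dX = r X dt + sigma X dW
  for _ in range(n_steps):
      dW = rng.normal(0.0, np.sqrt(dt), n_paths)
      A += j * dt + sigma * dW
      X += r * X * dt + sigma * X * dW

  print(A.mean(), j * T)          # drift j: E[A_T] - A_0 = j*T = 0.05
  print(X.mean(), np.exp(r * T))  # growth rate r: E[X_T] = X_0 * exp(r*T) ~ 1.051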

so-called tradable asset - disillusioned

The touted feature of a "tradable" doesn't impress me. Now I feel this feature is useful only for option pricing theory. Under the RN measure, all traded assets are supposed to grow, in expectation, at the same rate as the MMA, but I'm unsure about most of the "traded assets" such as --
- IR futures contracts
- weather contracts
- range notes
- a deep OTM option contract? I can't imagine any "growth" in this asset
- a premium bond close to maturity? Its price must drop to par, right? How can it grow?
- a swap I traded at the wrong time, so its value is decreasing into deeply negative territory? How can this asset grow?

My MSFM classmates confirmed that any dividend-paying stock is disqualified as a "traded asset". There must be no cash coming in or out of the security! It's such a contrived, artificial and theoretical concept! Other non-qualifiers:

eg: spot rate
eg: price of a dividend-paying stock – violates the self-financing criterion
eg: interest rates
eg: swap rate
eg: futures contract's price?
eg: coupon-paying bond's price

Black's formula isn't an interest rate model, briefly

My professors emphasized repeatedly:
* The first-generation IR models are the one-factor models, not the Black model.
* The Black model initially covered commodity futures.
* However, IR traders adopted Black's __formula__ to price the 3 most common IR options (see the caplet sketch at the end of this post):
** bond options (bond price @ expiry is LN)
** caps (Libor rate @ expiry is LN)
** swaptions (swap rate @ expiry is LN)
** However, it's illogical to assume that the bond price, Libor rate, and swap rate on the upcoming expiry date (3 N@FT) all follow LN distributions.

* The Black model is unable to model the term structure. I would say a proper IR model (like HJM) must describe the evolution of the entire yield curve, with N points on the curve; N can be 20 or infinite...
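
For my own record, here is Black's formula for a single caplet, the way IR desks use it (a minimal sketch in Python; the inputs are hypothetical, and the forward Libor rate is assumed LN at expiry):

  from math import log, sqrt
  from statistics import NormalDist

  def black_caplet(F, K, sigma, T_expiry, accrual, df_pay, notional=1.0):
      # Black-76: price = notional * accrual * P(0, T_pay) * [F*N(d1) - K*N(d2)]
      N = NormalDist().cdf
      d1 = (log(F / K) + 0.5 * sigma ** 2 * T_expiry) / (sigma * sqrt(T_expiry))
      d2 = d1 - sigma * sqrt(T_expiry)
      return notional * accrual * df_pay * (F * N(d1) - K * N(d2))

  # e.g. a 3M caplet expiring in 1Y: fwd Libor 3%, strike 3%, vol 20%, DF to payment 0.96
  print(black_caplet(F=0.03, K=0.03, sigma=0.20, T_expiry=1.0, accrual=0.25, df_pay=0.96))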

mean reversion in Hull-White model

The (well-known) mean reversion is in the drift, i.e. the instantaneous drift, under the physical measure.
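
To make "mean reversion in the drift" concrete, the usual Hull-White short-rate dynamics (my own reminder, not a quote) is

                dr_t = ( theta(t) - a * r_t ) dt + sigma dW_t

The dt term pulls r_t back toward theta(t)/a at speed a -- the mean reversion lives entirely in the drift, not in the dW term.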

(I think historical data shows mean reversion of IR, which is somehow related to the "mean reversion of drift"....)

When changing to RN measure, the drift is discarded, so not relevant to pricing.

However, on a "family snapshot", the implied vol of fwd Libor rate is lower the further out accrual startDate goes. This is observed on the market [1], and this vol won't be affected when changing measure. Hull-White model does model  this feature:


[1] I think this means the observed ED future price vol is lower for a 10Y expiry than a 1M expiry.

Friday, June 12, 2015

HJM, again

HJM's theory started with a formulation containing 2 "free" processes -- the drift (alpha) and vol (sigma) of the inst fwd rate:

                df(t,T) = alpha(t,T) dt + sigma(t,T) dW(t)

Both are functions of time and could be stochastic.

Note the vol is defined differently from the Black-Scholes vol.
Note this is under physical measure (not Q measure). 
Note the fwd rate is instantaneous, not simply compounded.

We then try to replicate one zero bond (shorter maturity) using another (longer maturity), and find that the drift process alpha(t) is constrained by the vol process sigma(t), under the P measure. In other words, the 2 processes are not "up to you" -- the absence of arbitrage enforces certain restrictions on the drift. See Jeff's lecture notes.
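
The standard statement of that restriction, once we switch to the Q measure of the next paragraph (my paraphrase, not a quote from Jeff's notes):

                alpha(t,T) = sigma(t,T) * Integral_{u=t..T} sigma(t,u) du

i.e. under Q the drift of the inst fwd rate is fully pinned down by the vol process alone.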

Under the Q measure, the new drift process [1] is completely determined by the vol process. This is a major feature of the HJM framework. Hull-White focuses on this vol process and models it as an exponential function of time-to-maturity:

                sigma(t,T) = sigma * exp( -a * (T - t) )    (sigma, a constants)

That "T" above is confusing. It is a constant in the "df" stochastic integral formula and refers to the forward start date of the (overnight, or even shorter) underlying forward loan, with accrual period 0.

[1] completely unrelated to the physical drift alpha(t)

Why bother to change to Q measure? I feel we cannot do any option pricing under P measure.  P measure is subjective. Each investor could have her own P measure.

Pricing under Q is theoretically sound but mathematically clumsy due to stochastic interest rate, so we change numeraire again to the T-maturity zero bond.

Before HJM, (I believe) the earlier TS models couldn't support replication between bonds of 2 maturities -- bond prices were mutually inconsistent and arbitrage-able.

vol, unlike stdev, always implies a (stoch) Process

Volatility, in the context of pure math (not necessarily finance), refers to the coefficient of dW term. Therefore,
* it implies a measure,
* it implies a process, a stoch process

Therefore, if a vol number is 5%, it is, conceptually and physically, different from a stdev of 0.05.

* Stdev measures the dispersion of a static population, or a snapshot as I like to say. Again, think of the histogram.
* variance parameter (vol^2) of BM shows diffusion speed.
* if we quadruple the variance param (doubling the vol), then the terminal snapshot's stdev will double -- see the quick simulation below.
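
A quick check of that last bullet (my own sketch; the 5% vol is arbitrary):

  import numpy as np

  rng = np.random.default_rng(2)
  T, n = 1.0, 200000
  v = 0.05 ** 2                                  # variance parameter of the BM

  x_lo = rng.normal(0.0, np.sqrt(v * T), n)      # terminal snapshot, vol 5%
  x_hi = rng.normal(0.0, np.sqrt(4 * v * T), n)  # quadruple the variance parameter

  print(x_lo.std(ddof=1), x_hi.std(ddof=1))      # the second stdev is ~2x the first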

At any time, there's an instantaneous vol value, like 5%. This could last a brief interval before it increases or decreases. The vol value changes, as specified in most financial models, but it changes slowly -- quasi-constant... (see other blog posts)

There is also a Black-Scholes vol. See other posts.

BS-vol - different from the "standard" vol

The "vol" in BS-Equ is the sigma: (Notice S and the GBM)

    

In other contexts, "vol" is defined as the y in

                dX_t = (...) dt + y dW_t

... which is more in line with the "variance parameter" of a BM, denoted v:

                dX_t = mu dt + sqrt(v) dW_t

Thursday, June 11, 2015

various discount curves

For each currency
For each Libor tenor i.e. reset frequency like 3M, 6M
There's a yield curve

STIRT traders basically publish these curves via Sprite. Each currency has a bunch of tenor curves + the FX-OIS curve

This is the YC for the AA credit quality. In theory this yield curve is not usable for a different credit quality. For a BB credit quality, the mathematicians would, correctly, assume a separate yield curve, but in reality I doubt there's one.

In contrast, there is indeed a separate tenor curve at 1Y, and at other tenors too.

A basis swap means an interest rate swap between 2 floating streams:
* swap between 2 currencies
* swap between 2 Libor tenors
* swap between 2 floating indices. The curves below have different credit qualities:
** Libor -- AA banks
** OIS -- much lower credit risk given the short tenor
** T-bill -- US government credit

Tuesday, June 9, 2015

accumu lens: which past accumulations proved long-term@@

label - skillist

(There's a recoll on this accumulation lens concept.... )

Holy grail is orgro, thin->thick->thin..., but most of my attempts failed.

I have no choice but to keep shifting. A focus on apache+mysql+php+javascript would leave me with few options.

--hall of fame
data structure theory + implementation in java, STL, c#? yes
[TD] core java (J2EE is churning)?

--also-ran
[!TC] bond math? Not really my chosen direction, so no serious investment
[!TC] option math?

SQL? yes but not a tier one skill like c++ or c#
SQL tuning? not much demand in the c++ interviews, but better in other interviews
[D] Excel + VBA? Not my chosen direction
[T] threading? Yes insight and essential techniques
C programming? not many roles
py?
[D] Unix instrumentation, automation
DBA? No. This is not my chosen direction

--strengths
C= churn rate is comfortable
D= robust demand
T= thin->thick->thin achieved

Radon-Nikodym derivative (Lida video)

Lida pointed out CoM (change of measure) means that given a pdf bell curve, we change its mean while preserving its "shape"! I guess the shape is the LN shape?

I guess CoM doesn't always preserve the shape.

Lida explained how to change one Expectation integral into another... Radon Nikodym.

The concept of operating under a measure (call it f) is fundamental and frequently mentioned but abstract...

Aha - integrating against the pdf f() is the same as taking the expectation under measure-f. This is one simple, if not rigorous, interpretation of operating under a given measure. I believe there's no BM or GBM, or any stochastic process, at this stage -- she was describing how to transform one static pdf curve into another by changing measure. I think Girsanov is different: it's about a (stochastic) process, not a static distribution.
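
My own shorthand of what Lida described, with f and g the two pdfs, so the Radon-Nikodym derivative is dQ/dP = g(x)/f(x):

                E_Q[ h(X) ] = Integral h(x) g(x) dx
                            = Integral h(x) * ( g(x)/f(x) ) * f(x) dx
                            = E_P[ h(X) * dQ/dP ]

Same integral, just re-weighted -- no stochastic process needed yet, consistent with the point above that Girsanov is a separate (process-level) story.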

discounted asset price is MG but "discount" means...@@

The Fundamental Theorem

A financial market with time horizon T and price processes of the risky asset and the riskless bond (I would say a money-market account) given by S0, ..., ST and B0, ..., BT, respectively, is arbitrage-free under the real-world probability P if and only if there exists an equivalent probability measure Q (i.e. a risk-neutral measure) such that
the discounted price process X0 := S0/B0, ..., XT := ST/BT is a martingale under Q.

#1 key concept – divide the current stock price by the current MMA value. This is the essence of "discounting" here, different from the usual "discount future cashflow to present value".
#2 key concept – the alternative interpretation is "using the MMA as currency (numeraire), any asset price S(t) is a martingale".

I like the discrete time-series notation, from time_0, time_1, time_2... to time_T.

I like the simplified (not simplistic:) 2-asset world.

This theorem is generalized with stochastic interest rate on the riskless bond:)

There's an implicit filtration. The S(T) or B(T) are prices in the future, i.e. yet to be revealed [1]. The expectation of future prices is taken conditional on the filtration.

[1] though in the case of T-forward measure, B(T) = 1.0 known in advance.
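
A tiny numerical check of the theorem in a one-period, 2-asset world (my own toy numbers):

  # riskless bond B and stock S over one period
  B0, B1 = 1.00, 1.05           # MMA grows at 5%
  S0, Su, Sd = 100.0, 120.0, 90.0

  # the q that makes the discounted stock price a martingale
  q = (B1 / B0 * S0 - Sd) / (Su - Sd)

  lhs = S0 / B0                          # X_0
  rhs = (q * Su + (1 - q) * Sd) / B1     # E_Q[ X_1 ]
  print(q, lhs, rhs)                     # 0.5, 100.0, 100.0 -- the martingale property holds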


--[[Hull]] P 636 has a concise one-pager (I would skip the math part) that explains the numeraire can be just "a tradable", not only the MMA. A few key points:

) both S and B must be 2 tradables, not something like "fwd rate" or "volatility"
) the measure is the measure related to the numeraire asset
) what market forces ensure this ratio is a MG? Arbitrageurs!

T-fwd measure - #1 key feature

See also the backgrounder – the blog post on discounted asset prices and martingales.

Q: Among all numeraires, which one has a known future time-T value?
A: The zero bond maturing at time T is the only numeraire I know of whose time-T value is known in advance. Therefore the measure behind this numeraire is uniquely useful.
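
Concretely, for a claim with a single payoff H at time T (my shorthand, writing P(t,T) for the zero bond price):

                V(0) = P(0,T) * E^T[ H ]      (expectation under the T-forward measure)

Because the numeraire's time-T value P(T,T) = 1 is known in advance, the discount factor comes out of the expectation -- no need to model the joint behaviour of rates and payoff.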

HJM, briefly

* HJM uses (inst) fwd rate, which is continuously compounded. Some alternative term structure models use the "short rate" i.e. the extreme version of spot overnight rate. Yet other models [1] use the conventional "fwd rate" (i.e. compounded 3M loan rate, X months forward.)

[1] the Libor Mkt Model

* HJM is mostly under RN measure. The physical measure is used a bit in the initial SDE...

* Under the RN measure, the fwd rate follows a BM (not a GBM) with instantaneous drift rate and instantaneous variance both time-dependent but slow-moving. Since it's not a GBM, the N@T is Normal, not LN.
** However, to use the market-standard Black's formula, the discrete fwd rate has to be LN.


* HJM is a 2nd-generation term-structure model and one of the earliest arbitrage-free models. In contrast, the Black formula is not even an interest rate model.

Sunday, June 7, 2015

[[Hull]] is primarily theoretical

[[Hull]] is first a theoretical / academic introductory book. He really likes theoretical stuff and makes a living on the theories.

As a sensible academic, he recognizes the (theory-practice) "gaps" and brings them to students' attention, but I presume many students have no spare bandwidth for them. Exams and grades are mostly on the theories.

bonds - relevant across FI models (HJM..

Bonds are no longer the dominant FI instrument, challenged by IRS and futures. However, I feel that for Fixed Income models, bonds are more important and more useful than any other instrument.

- Bond (unlike swap rates, FRA rates etc) is a tradeable, and obeys many nice martingale pricing rules.

- Zero Bond can be a numeraire.

- For model calibration, zero bond prices are, IMO, more easily observable than swap rates, FRA rates, cap prices, swaption prices etc. I think futures prices could be more "observable" but the maturities available are limited.

- A zero bond's price is exactly the discount factor, extremely useful in the math. I believe a full stochastic model built with zero bond prices can recover fwd rates, spot rates, short rates, swap rates and all the others -- see the standard relations after this list.

- I believe term structure models could be based on fwd rate or short rate, but they all need to provide an expression for the "zero bond process" i.e. the process that a given bond's price follows. Note this process must converge to par at maturity.

- Bond as an instrument is closely related to caps and floors and callable bonds.

- Bonds are issued by all types of issuers. Other instruments (like swaps, IR futures) tend to have a smaller scope.

- Bonds are liquid over most regions of the yield curve, except the extreme short end.
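
The standard relations behind the claims in the list above (jotted here for my own reference; y is the continuously compounded zero rate):

                P(t,T) = exp( - Integral_{u=t..T} f(t,u) du )
                f(t,T) = - d/dT ln P(t,T)
                y(t,T) = - ln P(t,T) / (T - t)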

why martingale so welcome -- simply stated

label – bold

P645 of [[Hull]] hints that if we can perceive an (unfolding) price process as a MG, then it will be much simpler to price a contingent claim.

In a similar sense, if a contingent claim has only a single payoff at a future time T, then the T-forward measure can simplify things significantly.
