All posts by oksure

Quick Follow-up on the Effectiveness of Cross-Posting

This is a quick follow-up report on my quick test about the effectiveness of cross-posting from WordPress to Medium posted a few days ago.

I have been observing the stats on both the Medium post and my blog post for a few days. I conclude that it is an okay time to give a quick report on what I learned from this testing.

Medium gives you exposure. About 20% of Medium readers visited my blog.

Medium Stats

Over the past 4-5 days, 59 people viewed the article on Medium. 80% of them actually read it through and 4 people favorited it. Out of those four, one is me. To me, 59 viewers was quite an impressive number given that my Medium account has been dormant for years. An 80% reading ratio is also probably high, but the article itself is quite short anyway. The daily number of viewers went [3, 37, 12, 4, 3]. I then compared it with Google Analytics. Over the same period, the daily number of visitors to my blog went [1, 6, 2, 1, 2]. Although the referral information was not captured in my current Google Analytics, I suspect that these two trends resemble each other. The blog total is 12 against 59 on Medium, so I conclude for now that about 20% of people who read my article on Medium will visit my blog.
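As a rough check of the 20% figure, here is a minimal Python sketch that divides total blog visits by total Medium views. It assumes every blog visit in that period came from Medium, which the missing referral data cannot confirm. The function name conversion_rate is just an illustrative choice.

```python
# Daily counts quoted in the post (Medium views, then blog visits).
medium_daily_views = [3, 37, 12, 4, 3]
blog_daily_visits = [1, 6, 2, 1, 2]


def conversion_rate(source_views, destination_visits):
    """Share of source views that show up as destination visits.

    Assumption: all destination visits originate from the source,
    which the post could not verify from referral data.
    """
    total_views = sum(source_views)
    if total_views == 0:
        raise ValueError("no views to convert from")
    return sum(destination_visits) / total_views


rate = conversion_rate(medium_daily_views, blog_daily_visits)
print(f"{sum(blog_daily_visits)} of {sum(medium_daily_views)} viewers: {rate:.1%}")
```

With the numbers above, this comes out to 12 of 59, or roughly 20%.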

There are a few things to think about further. Will this conversion rate increase or decrease as my readership grows on Medium? My guess is it will decrease as only new readers will visit my blog out of curiosity and old readers will have no reason to do so. I guess for popular Medium writers the ratio will be even below 10%.

Time lag is another dimension to consider. As Sarah Cooper mentioned in her comment, what will happen if I post an article on Medium with a delay of a week? I don’t think it will affect the click-through rate to my blog per se. But it may influence search engine results if Google decides my blog primarily carries duplicate content. First of all, I think Google is smart enough. Second, there seem to be some workarounds for this type of cross-posting across blogs. The Medium WP plugin in fact adds a “canonical” link header to the page, so I am not worrying about timing too much right now.

Mention some people you want to get along with on Medium.

One thing I was very happy about in this process is that Sarah Cooper was the first person who commented on my post! This is exactly what I like about joining a new platform in its early days. You can engage with people that you will not hear back from on Facebook or Twitter. If I were to start following President Obama on Twitter today, I would be one of his 76 million followers. On Medium, I would be one of 12k followers. Ignore that both fractions are practically zero. The difference is a few orders of magnitude. Mentioning in your writing some people who are active on Medium and whom you would like to get to know is a good way to stay motivated and also boost your initial exposure.

Having some initial followers must have helped.

I also had about 80 mutual followers imported from other social media (Twitter, I think) when I first created an account at Medium. Having those initial followers must have helped as well. My advice is to start following some people on Medium and reading what they write. The more time I spend reading, the more I feel embedded in this growing community. Activities like following, reading, commenting, liking, etc. will help bootstrap the initial set of followers.

In summary, it was an interesting test for me and I hope it can shed some light on what traffic looks like when you first start blogging and using Medium. Although I am by no means a professional blogger, I look forward to meeting and interacting with people at Medium on occasion.

Effectiveness of Cross-Posting from WordPress to Medium

As I was preparing to restart my WordPress blog, I wondered what syndication options I could have for new posts. I synced my blog with Twitter and Facebook a few years ago, but I stopped updating Twitter and Facebook for a while. I learned about Medium a few years ago and I always thought I would try out the platform one day. And, today is the day.

Cross-posting from WordPress to Medium

Once I decided to try Medium, a few thoughts crossed my mind.

  • Is posting duplicate articles on both WP and Medium allowed?
    The key difference between Medium and Twitter or Facebook as a syndication channel is that Medium is a full-fledged platform dedicated to online publication. When I used Twitter and Facebook, they served mostly as a notification system: notifying my followers about a new post and the link to it. At Medium, on the other hand, the norm seems to be posting the entire article on the platform. This may create an SEO problem. From googling, some people recommend publishing on Medium a few days after your original post is out on WordPress. Others recommend repurposing the article to target the Medium audience specifically. For now, I would rather publish an almost identical post on Medium and see what happens. Still, I will selectively send articles to Medium. Not all short notes need to go to Medium.

  • Who is already doing it?
    There seem to be quite a number of people using Medium as a syndication platform for their WordPress blog. One example (biznology), another example (Sarah Cooper), another (Jared Stein). It is hard to find the list of all WordPress bloggers who send their posts to Medium, but many people seem to be using Medium as a content distribution platform for sure. Maybe I will be able to meet some people doing the same thing there once I start using Medium.

  • Is there a WordPress plugin?
    In fact, there is a seemingly official WordPress plugin from Medium. I had to go through a few steps, such as entering the access token, before getting it working, but overall the process was very straightforward. I will see if it works by publishing this post. If you are reading this article on Medium, it means the plugin worked at least for me.

A reason for this post: Testing the water

One reason for this inaugural post is to see how many people actually visit my WordPress blog through Medium. Right now, both my blog and Medium account are a clean slate. This blog is (almost) blank and my Medium account is literally blank. Using Google Analytics, perhaps I can check where people are coming from. Once I have some sizable statistics, I will write about it.

The metrics that I am interested in right now are as follows:

  • Side-by-side comparison of the number of viewers
  • How many people visit this WordPress blog from Medium

Please leave a comment (only on Medium) about what you are interested in. I will try to incorporate those requests when I get to write about it.

Geocode batch conversion + latitude, longitude, formatted address

I recently worked on a project that required getting latitude and longitude from unstructured addresses. My advisor found me a website that runs this conversion in batches, and it turned out that the website was quite decent.

Looking at its JavaScript a bit, they seem to be using Google’s API. So, if you put in any piece of text related to a location, it will do its best to convert it into a normalized address and latitude-longitude coordinates. Here’s an example.

"original address","returned address",latitude,longitude,accuracy,status code
"georgia tech","Georgia Institute of Technology, North Ave NW, Atlanta, GA 30332, USA",33.775618,-84.396285,3,200
"seoul","Seoul, South Korea",37.566535,126.977969,3,200
"white house","The White House, 1600 Pennsylvania Avenue Northwest, Washington, DC 20500, USA",38.897676,-77.03653,3,200

It seems to be a pretty useful research tool for me as I don’t have to write code myself to use Google’s geocoding API.
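If you save the batch output as CSV, a few lines of Python are enough to load it for analysis. The sketch below parses the sample rows shown above. Treating status code 200 as "success" is my assumption based on the sample, and parse_geocoded is a hypothetical helper name.

```python
import csv
import io

# Sample output copied from the post; in practice you would read a saved file.
sample = '''"original address","returned address",latitude,longitude,accuracy,status code
"georgia tech","Georgia Institute of Technology, North Ave NW, Atlanta, GA 30332, USA",33.775618,-84.396285,3,200
"seoul","Seoul, South Korea",37.566535,126.977969,3,200
"white house","The White House, 1600 Pennsylvania Avenue Northwest, Washington, DC 20500, USA",38.897676,-77.03653,3,200
'''


def parse_geocoded(text):
    """Return one dict per successfully geocoded row."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: 200 marks a successful lookup, as in the sample rows.
        if row["status code"] != "200":
            continue
        rows.append({
            "query": row["original address"],
            "address": row["returned address"],
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
        })
    return rows


places = parse_geocoded(sample)
for place in places:
    print(place["query"], place["latitude"], place["longitude"])
```

The csv module takes care of the commas inside the quoted addresses, which a plain split on "," would get wrong.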

How I follow things these days — chain reaction of discovery

Since graduating from ischool, I have not followed the trends in the tech startup space much for years; I was busy taking classes and doing research, all good stuff. Many things seem to have happened: crunchbase started and some new languages and frameworks (node.js, golang) have appeared.

Recently, I signed up to receive the daily newsletter from crunchbase. It sends me a digest of startups that got funded. Skimming through them one by one is becoming my daily habit now. You can browse the archive of their newsletter from here:

Then, a few days ago I found this new service called producthunt. It’s basically a daily tournament of new products and services.

One thing I recently found was drop. You can take a look yourself.

Obviously cool things are coming to the market every day and it’s quite amazing to watch them in almost real time.

Kaffeine: prevent heroku app from sleeping

Many cloud apps end up sleeping because the platform provider idles an app when it receives no request for a certain amount of time. Heroku’s threshold is 1 hour. I found several discussions on the web:

I first tried adding new relic to my heroku app, but it seemed like overkill and was a bit involved to get working.

Then, I found this “Kaffeine” and it worked for me. One limitation is that it’s only for heroku apps.
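The underlying idea is simple: send the app a request more often than the idle threshold. Below is a minimal Python sketch of that idea. It is not how Kaffeine itself is implemented; the 30-minute default, the keep_awake name, and the URL in the usage note are all my own assumptions.

```python
import time
import urllib.request


def keep_awake(url, interval_seconds=30 * 60, rounds=None, fetch=None, sleep=time.sleep):
    """Request `url` every `interval_seconds`; return the number of successful pings.

    `rounds=None` loops forever. `fetch` and `sleep` can be swapped out,
    which makes the loop easy to try without network access.
    """
    if fetch is None:
        def fetch(target):
            # A plain GET; the response body is not needed.
            with urllib.request.urlopen(target, timeout=10) as response:
                return response.status

    successes = 0
    done = 0
    while rounds is None or done < rounds:
        try:
            fetch(url)
            successes += 1
        except OSError as error:  # urllib's URLError is a subclass of OSError
            print(f"ping failed: {error}")
        done += 1
        if rounds is None or done < rounds:
            sleep(interval_seconds)
    return successes
```

You would call it as keep_awake("https://your-app.herokuapp.com/") from a machine that stays on, where the URL is a placeholder for your own app. The interval only needs to be comfortably shorter than the provider's idle threshold.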

For other cloud providers, here are a few alternatives I found.

Leveraging the color palette used by Google Visualization

I have used colors from the Google Visualization default palette for several projects. You can quickly generate all colors using the simple code below, then pick those colors and save them for yourself.

If you don’t have a way to extract RGB codes, here it is. The order is blue, red, orange, green, purple, and so on. Although I am giving you about 20 colors here, I personally try not to go beyond 5-6 colors at the most. The colors in this list beyond that threshold seem to start repeating themselves.
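If you end up with hex codes and need RGB triples, the conversion is short. The sketch below uses the five hex values I recall as the first default Google Charts colors (blue, red, orange, green, purple). Treat those values as an assumption and check them against your own chart output before relying on them.

```python
# Assumed first five default colors, in the order blue, red, orange, green, purple.
# These hex codes are recalled, not taken from the post; verify before use.
assumed_palette = ["#3366CC", "#DC3912", "#FF9900", "#109618", "#990099"]


def hex_to_rgb(hex_code):
    """Convert '#RRGGBB' (or 'RRGGBB') into an (r, g, b) tuple of ints."""
    value = hex_code.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_code!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


for code in assumed_palette:
    print(code, hex_to_rgb(code))
```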

Circos Data Format Explained


Circos is a visualization tool that draws networks in a circular fashion. In my experience and to the best of my knowledge, it is the richest medium in which a network can be shown and data can be visually encoded.

What this post is NOT about

Circos is not the most user-friendly visualization tool on earth. For those looking for installation help, you’ve got the wrong number. Here are a few recommended readings that I referred to when I had problems installing circos.

Making sure all parts of circos are downloaded and properly loaded is just painful. I guess more than half of people who were fascinated by a circos graphic would give up trying to install it on their machine. It was just that hard for me.

Once you successfully install circos and are able to produce a graphic following the official tutorials, you will be amazed by how comprehensive those tutorials are. However, the problem with having a comprehensive set of tutorials is that you cannot easily find a way to convert your traditional network viz into one of the cool circos viz, both conceptually and technically.

This post is intended for those who 1) have installed circos, 2) have produced some graphics following its tutorial, and 3) now want to plug their own data into the circos format. It’s my attempt to document the way I understand how one can transform a usual network visualization into a circos visualization.

Anatomy of a circos visualization

Before getting started, I’d like to emphasize that a circos visualization has different naming conventions for its parts. This made it hard for me to understand what their tutorial meant from the beginning. So, first off, I recommend you skim through the following nice summary slides on anatomy of circos graphics.

From the figure above, remember four elements: (B) ideogram, (H) ticks, (F) highlights, and (E) links. Ideogram means the circular arc segments around a big circle with some thickness. Ticks show the units of the viz. Highlights are meant to emphasize a certain part of an arc. Lastly, links are connections between arcs.

How circos is conceptually different from the usual network visualization

If I were to create a usual network visualization of a two-node graph, it would look like this.

Circos can visualize this kind of relationship for sure, but it is capable of doing the job for much more complex relationships. For example, suppose the two-node graph we saw above is now a multi-graph, i.e., a pair of nodes can have more than one edge. The figure below shows this network. Nodes 1 and 2 now have three edges between them with varying weights.

If you share some sense of aesthetics with me, you will realize it’s ugly and, more importantly, arbitrary, and that there must be a better way to deal with this sort of situation. And circos is the one.

Understanding circos data format

The initial purpose of circos was to visualize relationships among chromosomes in a genome. Look at some of these wiki pages to see if it helps.

You may think its origin doesn’t really matter as long as it works to solve your problem. But the problem is that the circos documentation explains things using biology jargon (karyotype, chromosome), which I think hinders understanding for a general audience.

After some hours of struggling, I devised my own way of interpreting the biological concepts built in circos. First of all, a chromosome is a node. So, you need to prepare a file that contains the list of all nodes. Suppose nodes 1 and 2 are US and China, respectively, and you are trying to visualize some trades between them. The first thing you need is something like this.


chr - usa USA 0 2000 myblue
chr - chn CHINA 0 1000 myred

Let me explain one by one.

  • Two lines: We will have two nodes in our viz.
  • Every line starts with “chr - ”: It’s just a convention denoting that this line describes a node (i.e., a chromosome).
  • “usa” / “chn”: node id
  • “USA” / “CHINA”: node labels
  • 0 to 2000 / 0 to 1000: node size (i.e., start and end position). The USA node is of size 2000 and the CHINA node is of size 1000. Note that circos only accepts integers for positions.
  • The last element of each line denotes node color. I will explain how to define your own color in the next section. Here we focus on setting up data files in the right format.
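If your nodes already live in a script or spreadsheet, you can generate this file rather than type it. Here is a minimal Python sketch that builds the two lines above. The karyotype_line helper is a hypothetical name, and the integer check simply mirrors the note about positions.

```python
def karyotype_line(node_id, label, size, color):
    """Format one node as a circos karyotype line: chr - id label start end color."""
    # Positions must be integers; reject floats and booleans explicitly.
    if isinstance(size, bool) or not isinstance(size, int) or size <= 0:
        raise ValueError(f"node size must be a positive integer, got {size!r}")
    return f"chr - {node_id} {label} 0 {size} {color}"


# (id, label, size, color) for each node in the example.
nodes = [
    ("usa", "USA", 2000, "myblue"),
    ("chn", "CHINA", 1000, "myred"),
]

karyotype_text = "\n".join(karyotype_line(*node) for node in nodes) + "\n"
print(karyotype_text, end="")
```

Writing karyotype_text to data/usachn/nodes.txt would give you the node file used later in this post.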

Now that you have prepared the list of nodes, let us get to the list of edges. Edges are called “links” in circos. Recall the example of the two-node multi-graph above. Suppose we are trying to implement three edges between USA and CHINA.


usa 200 500 chn 100 250 color=myblue_transparent
usa 700 900 chn 500 600 color=myred_transparent
usa 1200 1300 chn 800 850 color=myblue_transparent

Each line is formatted as “node1_id node1_start node1_end node2_id node2_start node2_end color=mycolor”. Note that a pair of nodes can have multiple edges and each edge occupies a different part of each node. This will be more evident in the final graphic.
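The same line format can be produced from code. The Python sketch below rebuilds the three edges and checks that each span fits inside its node. The link_line helper is hypothetical, and requiring start < end is a simplification of mine, not a circos rule.

```python
# Node sizes from the karyotype file in this post.
node_sizes = {"usa": 2000, "chn": 1000}


def link_line(src, src_span, dst, dst_span, color, sizes=node_sizes):
    """Format one edge as: node1 start end node2 start end color=name."""
    for node_id, (start, end) in ((src, src_span), (dst, dst_span)):
        # Simplifying assumption: spans run forward and stay inside the node.
        if not 0 <= start < end <= sizes[node_id]:
            raise ValueError(f"span {start}-{end} does not fit node {node_id!r}")
    return (f"{src} {src_span[0]} {src_span[1]} "
            f"{dst} {dst_span[0]} {dst_span[1]} color={color}")


edges = [
    ("usa", (200, 500), "chn", (100, 250), "myblue_transparent"),
    ("usa", (700, 900), "chn", (500, 600), "myred_transparent"),
    ("usa", (1200, 1300), "chn", (800, 850), "myblue_transparent"),
]

edges_text = "\n".join(link_line(*edge) for edge in edges) + "\n"
print(edges_text, end="")
```

The width of each span is what encodes the edge weight on each side, so the two ends of one edge can have different widths.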

We have prepared all the basic elements so far. These two files are just the bare bones. However, circos actually provides many more charting functionalities which I cannot go over in this post. Let me show you how to play around with one of them here. Suppose you want to highlight some parts of the nodes with a different color. Then, you prepare the following file in addition to nodes.txt and edges.txt.


usa 100 700 fill_color=myred
usa 700 1300 fill_color=myblue
usa 1300 1900 fill_color=myred
chn 100 500 fill_color=myblue
chn 500 900 fill_color=myred

The structure of this highlight file will become self-evident when we see the output visualization.
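Until then, one way to read the file is "node id, start, end, then options". The Python sketch below parses the five lines that way and checks each span against the node sizes. Both helper names are hypothetical, and the parsing rule is my reading of the example rather than the official format definition.

```python
# Node sizes from the karyotype file in this post.
node_sizes = {"usa": 2000, "chn": 1000}

highlights_text = """\
usa 100 700 fill_color=myred
usa 700 1300 fill_color=myblue
usa 1300 1900 fill_color=myred
chn 100 500 fill_color=myblue
chn 500 900 fill_color=myred
"""


def parse_highlights(text):
    """Split each line into (node_id, start, end, options dict)."""
    spans = []
    for line in text.splitlines():
        if not line.strip():
            continue
        node_id, start, end, *options = line.split()
        parsed_options = dict(option.split("=", 1) for option in options)
        spans.append((node_id, int(start), int(end), parsed_options))
    return spans


def out_of_range(spans, sizes):
    """Return the spans that do not fit inside their node."""
    return [span for span in spans
            if not 0 <= span[1] < span[2] <= sizes[span[0]]]


highlights = parse_highlights(highlights_text)
print(len(highlights), "highlights,", len(out_of_range(highlights, node_sizes)), "out of range")
```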

Putting all together into a visualization

At the heart of every circos visualization is the configuration file. A config file contains a list of commands (or directives) you want the viz engine to perform. Your custom definitions of color, font, and placement all go into the config file. Now that you have all three data files ready (nodes.txt, edges.txt, and highlights.txt), you just need to invoke these files using the right language in the config file. Let’s say your main configuration file is named “usachn.conf” under the “etc/” folder and the data files reside in the “data/usachn/” folder.

First, read in your nodes by this.

karyotype = data/usachn/nodes.txt

Your edges are read by the following directives, which go inside a “link” block nested in a “links” block in the config file.

file = data/usachn/edges.txt
ribbon = yes
flat = yes
radius = dims(ideogram,radius_inner)-30
bezier_radius = 0r

“ribbon” and “flat” should be set to “yes” in order to make circos render edges as defined in our data file, edges.txt. “radius” determines where links start. In this case, edges are drawn 30 pixels inside the inner circle of the ideogram. (Ideogram means the circular ring of nodes.) “bezier_radius” determines the curvature of the edges.

Highlights are called in as follows.

type = highlight
file = data/usachn/highlights.txt
r0   = dims(ideogram,radius_inner)-5-15
r1   = dims(ideogram,radius_inner)-5
stroke_color = dgrey
stroke_thickness = 0p

Output file destination is put in as follows.

<<include etc/image.conf>>
file* = circos-usachn.png

Lastly, your custom colors (“myblue” and “myred”) are defined in the configuration file as follows.

myblue = 0, 0, 255
myred = 255, 0, 0

myblue_transparent = 0, 0, 255, .5
myred_transparent = 255, 0, 0, .5

For each line of custom color definition, the first three elements are R, G, B, and the fourth optional field is for alpha (transparency). 0 is fully opaque and 1 is fully transparent.

The full configuration file can be viewed and downloaded here. You run circos using this config file in the command line as follows.

circos -conf etc/usachn.conf

And finally the resulting visualization will look like this.

It may not be as fancy as what you saw on the internet, but you probably have a better idea by now of how circos interprets your commands and data. You can play around with some of the parameters in the config file to see which setting leads to which output feature.


Circos is just an impressively rich medium for visualization. It provides tons of other visualization elements such as histograms or 2D plots. Admittedly, I don’t know everything circos offers. But when I struggled with the conceptual aspect of circos, I couldn’t find a simple go-to example on the web. All the documentation and even the tutorials seemed very archaic to me. (Now I probably understand them better.) So, I decided to write one. Hope it helps!

A Model Blog for Research Digest

I came across a blog that can be a model of what I am envisioning for my small blog. It’s called BPS Research Digest. Although it is an authoritative blog published by the British Psychological Society and my blog is just by me, I would like to use this space to review papers I find interesting in the manner that BPS presents them.

The post that led me into the blog is this. I quickly analyzed the structure of the writing to have a guideline for myself.

From reading this post, a general structure I could write is as follows.

  1. Introduce with a commonly known story that could be potentially the motivation of the research. In this case, the author mentions a common human error in judgment called the Gambler’s Fallacy. (about 150 words)
  2. Explain what the authors actually did in the paper. Experiment? Modeling? Secondary data analysis? Highlight the main contribution only even if the authors have done many things for the whole paper. (about 100 words)
  3. Insert a picture in the middle if it can be helpful. (optional; Search Creative Commons images from flickr here)
  4. Summarize the results. (about 200 words)
  5. Summarize the authors’ interpretation on the results. (about 150 words)
  6. Conclude by going back to the theme used in the introduction, the gambler’s fallacy in this case. (about 100 words)

If I can write this way, this alone will amount to 600-700 words. I hope it will be a good exercise for me to learn how to shape a good research question.

Admitted to Ph.D. Candidacy

Today I became a Ph.D. candidate! I presented the dissertation proposal last Friday and today. I have spent about a month on this proposal. Scheduling was one of the toughest things to do, and that’s why I proposed twice. (One should not do it twice.)

Overall, this official milestone of my Ph.D. study has prepared me to better frame and position the work I have been doing. In fact, I realized that a dissertation might be slightly different from a research paper to be published in an academic journal. The committee looks for a theory that can be as generalizable as possible and as applicable as possible to multiple contexts, while a research paper might want to be very specific in admitting the limitations of the work. This actually made me think about how I can and should generalize the findings or hypotheses to other contexts such as automobiles, shipbuilding, etc. This is quite a different task from what I have been doing. It will be challenging, but I find it a must-have for a dissertation.

In addition to the proposal, I gave a few presentations at POMS over the last weekend. In total, I pitched four presentations over four days. It was very exhausting, but also an intensive learning period. The most important thing I think I learned out of this presentation spree is the importance of storyline. Having a good narrative always helps. When I do research, often I am so focused that I lose the big picture of what I am doing and trying to say. In that sense, creating a powerpoint deck for the paper I want to present helps me stay with a consistent storyline and understand the key contributions of my own work.

During the last week, I have missed the 500-word quota. As I restart my routine workload, I need to get back to the quota again. This blog post has 308 words.

Plan for CV page

After finishing the dissertation proposal defense, I will start working on my academic website. The design requirements I am considering are:

  • Ease of update: I want to keep my CV up-to-date in Markdown format and convert it into a part of a web page as well as a pdf. I am not sure if I will have enough time.
  • Clean layout: I don’t want to throw all the fancy modern CSS techniques into a CV page. However, I do want it to display cleanly on different devices. I will keep interactive elements to a minimum.
  • Single index.html file: so that I can easily migrate things when I get a faculty job at another school.