<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>semanticvoid &#187; Project</title>
	<atom:link href="http://semanticvoid.com/blog/index.php/category/project/feed/" rel="self" type="application/rss+xml" />
	<link>http://semanticvoid.com/blog</link>
	<description>extracting the semantics from the void</description>
	<lastBuildDate>Thu, 22 Sep 2011 21:05:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>Reading Less Is Reading More</title>
		<link>http://semanticvoid.com/blog/2009/10/07/reading-less-is-reading-more/</link>
		<comments>http://semanticvoid.com/blog/2009/10/07/reading-less-is-reading-more/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 08:19:27 +0000</pubDate>
		<dc:creator>Anand Kishore</dc:creator>
				<category><![CDATA[Natural Language Processing]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[dygest]]></category>

		<guid isPermaLink="false">http://semanticvoid.com/blog/?p=363</guid>
		<description><![CDATA[If information is what drives you to the internet, like me, you might be spending roughly 60-70% of your time online reading blogs, news and feeds (not to forget twitter). For me at least, reading online has superseded email (and updating social networks) as the most time consuming activity. And yet everyone is busy generating [...]]]></description>
			<content:encoded><![CDATA[<p>If information is what drives you to the internet, like me, you might be spending roughly 60-70% of your time online reading blogs, news and feeds (not to forget twitter). For me at least, reading online has superseded email (and updating social networks) as the most time-consuming activity. And yet everyone is busy generating more content rather than finding a way to consume all this information. This is precisely the problem we are trying to tackle with <a href="http://dyge.st">Dygest</a>. At its core <a href="http://dyge.st">Dygest</a> is a summarization engine that tries to sift through all the noise and present only the *real* content/news contained in any (news) article/text. Recently, we released an experimental version of a feed summarizer that uses the <a href="http://dyge.st">Dygest</a> engine to summarize blogposts/news for any RSS/ATOM feed. You can subscribe to this summarized feed in any feed reader like Bloglines, Google Reader etc.</p>
<p><strong>NOTE</strong>: A feed that our system has never encountered before should be summarized within a couple of minutes.</p>
<p><center><img src="http://farm4.static.flickr.com/3423/3988948493_63da2cb1bd_o.png" alt="Feed Summarizer" /></center></p>
<p>On the whole with Dygest, reading blogs has now become much faster, much more concise and consuming information has become a great deal easier. Imagine the time saved reading the summarized version as compared to the original post (also you are not overwhelmed with useless information). See for yourself below:</p>
<p><center><img src="http://farm3.static.flickr.com/2594/3989711414_1f28fd59bd.jpg" alt="Original Post"/><br />
<strong>Original Post</strong></center></p>
<p><center><img src="http://farm3.static.flickr.com/2600/3988953559_d203feb1b6.jpg" alt="Summarized Post"/><br />
<strong>Summarized Post</strong></center></p>
<p>While you might have the urge to head over to Dygest and summarize your entire subscription list on Google Reader, I would recommend reading this post a bit further for some real cool stuff we have in store. If you must though &#8211; <a href="http://dyge.st">click here to Dygest</a>.</p>
<h3>Summarizing Your Twitter Links</h3>
<p><a href="http://readtwit.com">Readtwit</a> is a really cool service launched recently, which extracts links from your twitter feed and packages them in a clean RSS format. The awesome combination of Readtwit along with Dygest yields a summarized twitter feed delivered to your favorite feed reader.</p>
<p>Steps to get a summarized twitter feed:</p>
<p>(1) Sign into <a href="http://readtwit.com">Readtwit</a>.<br />
(2) Copy the link on the &#8216;Get me the feed&#8217; button:<br />
<center><img src="http://farm3.static.flickr.com/2454/3989734546_db979a08f5_m.jpg"/></center><br />
(3) Paste this link into the <a href="http://dyge.st">Dygest</a> interface and subscribe to the summarized feed returned in your favorite feed reader.<br />
<center><img src="http://farm3.static.flickr.com/2473/3988983827_57010939ff_o.png"/></center></p>
<h3>More To Come</h3>
<p>This is just an experimental release of <a href="http://dyge.st">Dygest</a> and so do send in your feedback on the summaries and help us improve. In the coming months we are working on improving the algorithms and churning out other great applications of <a href="http://dyge.st">Dygest</a> (there is something really cool in the works). So while we are busy teaching computers to read, <a href="http://dyge.st">Dygest</a> your feeds &#8211; because reading less is reading more.</p>
<p>Follow us on twitter &#8211; <a href="http://twitter.com/dygest">@dygest</a></p>
]]></content:encoded>
			<wfw:commentRss>http://semanticvoid.com/blog/2009/10/07/reading-less-is-reading-more/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Dygest Your Search</title>
		<link>http://semanticvoid.com/blog/2009/03/19/dygest-your-search/</link>
		<comments>http://semanticvoid.com/blog/2009/03/19/dygest-your-search/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 06:56:36 +0000</pubDate>
		<dc:creator>Anand Kishore</dc:creator>
				<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Natural Language Processing]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[Yahoo!]]></category>
		<category><![CDATA[summarization]]></category>

		<guid isPermaLink="false">http://semanticvoid.com/blog/?p=256</guid>
		<description><![CDATA[Update: This hack won the coveted &#8216;Search&#8217; category award. For the last couple of days, I and @sudheer_624 have been busy working on this hack for a Yahoo! Hackday. Although still a prototype, the hack has turned out to be interesting so we thought of putting it out for others to play around with. Dygest [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Update:</strong> This hack won the coveted &#8216;Search&#8217; category award.</p>
<p>For the last couple of days, I and <a href="http://twitter.com/sudheer_624">@sudheer_624</a> have been busy working on this hack for a Yahoo! Hackday. Although still a prototype, the hack has turned out to be interesting so we thought of putting it out for others to play around with.</p>
<p><strong>Dygest</strong> (pronounced as &#8216;digest&#8217; &#8211; thanks to <a href="http://twitter.com/bluesmoon">@bluesmoon</a>) is aimed at changing the conventional way of displaying search context via a snippet to a more informative, machine generated document summary. There are two kinds of relevance for evaluating search results:</p>
<ul>
<li>Vertical relevance: determined by the ranking algorithms.</li>
<li>Horizontal relevance: the contextual information made available to the user about the result &#8211; Searchmonkey is a good initiative on this front.</li>
</ul>
<p>
The current way of displaying this context is via a snippet of text under every result. This snippet shows the text surrounding the occurrences of the query terms. Usually this information is not rich enough for a searcher to make the right judgement about the result, so the searcher ends up switching back and forth between the documents and the search results whenever a page turns out not to be relevant. This can be frustrating at times.</p>
<p>
<strong>Dygest</strong> aims to solve this by either replacing or enhancing the current search snippet with a summary of the result page. At its core lies a summarization engine which figures out what the *real* content of the page is (distinguishing it from the other junk like surrounding text, navigational text, comments etc) and then performs text summarization on this content. The summary of the page is then displayed to the user via the appropriate interface. How cool is that?</p>
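<p>To make the summarization step a bit more concrete, here is a minimal sketch of extractive summarization in Python. It is not the Dygest engine: it assumes the main content of the page has already been extracted as plain text, and the function names, the tiny stopword list and the scoring rule (average word frequency per sentence) are all illustrative assumptions.</p>

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it",
             "for", "on", "that", "this", "with", "as", "was", "are"}


def split_sentences(text):
    # Naive splitter: break on whitespace that follows '.', '!' or '?'.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercase words with stopwords removed.
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favoured just for being long.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return [sentences[i] for i in keep]
```

<p>Calling <code>summarize(page_text, max_sentences=3)</code> on the extracted text of a result page would give a short list of sentences that could be shown in place of, or next to, the usual snippet.</p>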
<p>
The user no longer needs to click on irrelevant links. He/She can perceive the theme/important facts of the page right from the results page. The other advantage is that it gives the user a good overview of the query topic &#8211; instead of reading many long documents, he/she can read a few summaries from the top results. This is particularly well suited to mobile devices, where it&#8217;s frustrating to switch back and forth between pages and the search results. It also fits news articles, where we just need the important facts about the story.</p>
<p>
Well, here is an example to convince you. A search for &#8216;Carol Bartz&#8217; yields the following result, which at first glance is not at all informative.</p>
<p><center> <img alt="" border="2" src="http://farm4.static.flickr.com/3456/3369960208_48edc07644_o.png" title="search snippet for Carol Bartz" /> </center></p>
<p>
Enhancing the existing view with an abstract of the page helps gauge the content and theme of the document. This would now look like:</p>
<p><center> <img alt="" src="http://farm4.static.flickr.com/3637/3369975750_f0b313ae61_o.png" title="summarized view" /> </center></p>
<p><strong>Dygest</strong> outputs the following summaries for the query &#8216;<a href="http://datacracy.info/cgi-bin/dygest/search.py?q=iran+site%3Anews.yahoo.com">Iran</a>&#8217; restricted to Yahoo! News:</p>
<p><center><img alt="" src="http://farm4.static.flickr.com/3658/3370011200_a757dc42d8_o.png" title="Query for Iran" /></center></p>
<p>And the following for &#8216;<a href="http://datacracy.info/cgi-bin/dygest/search.py?q=obama+stimulus+plan">Obama stimulus plan</a>&#8217;:</p>
<p><center><img alt="" src="http://farm4.static.flickr.com/3578/3370098322_1a73cd285b_o.png" title="obama stimulus plan"  /></center></p>
<p>Currently, <strong>Dygest</strong> has two interfaces &#8211; (1) a search interface powered by Yahoo! BOSS and (2) a Searchmonkey plugin. It&#8217;s just a prototype so be kind and don&#8217;t be too judgmental.</p>
<p>Start dygest<em>ing</em> <a href="http://datacracy.info/dygest/">here</a>.</p>
<p><center><br />
<script src="http://pipes.yahoo.com/js/imagebadge.js">{"pipe_id":"3hCWTB0Y3hG3E9xK6ycw5g","_btype":"image"}</script><br />
</center></p>
]]></content:encoded>
			<wfw:commentRss>http://semanticvoid.com/blog/2009/03/19/dygest-your-search/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Glimpse &#8211; visualizing your browsing history</title>
		<link>http://semanticvoid.com/blog/2008/07/19/glimpse-visualizing-your-browsing-history/</link>
		<comments>http://semanticvoid.com/blog/2008/07/19/glimpse-visualizing-your-browsing-history/#comments</comments>
		<pubDate>Sun, 20 Jul 2008 06:05:41 +0000</pubDate>
		<dc:creator>Anand Kishore</dc:creator>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[LifeLogger]]></category>
		<category><![CDATA[Patterns]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Trends]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://semanticvoid.com/blog/2008/07/19/glimpse-visualizing-your-browsing-history/</guid>
		<description><![CDATA[I started working on my second weekend project, guess I&#8217;ll do something small every week. This one is an extension to LifeLogger. The aim is to analyze one&#8217;s daily and weekly browsing history and extract themes which could aid in recommendations. It is still a &#8216;work in progress&#8217; &#8211; currently I have been able to [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://semanticvoid.com/images/glimpse.png"/> I started working on my second weekend project, guess I&#8217;ll do something small every week. This one is an extension to <a href="http://semanticvoid.com/lifelogger">LifeLogger</a>. The aim is to analyze one&#8217;s daily and weekly browsing history and extract themes which could aid in recommendations. It is still a &#8216;work in progress&#8217; &#8211; currently I have been able to generate the following visualizations:</p>
<p>The following visualization depicts the dominant keywords/topics for one day (the terms are stemmed):<br />
<script type="text/javascript" src="http://services.alphaworks.ibm.com/manyeyes/api/v1/snapshot/89ade5ae1b21b772011b3eec0aea0e56.js"></script><br />
I had been reading a couple of Yahoo! related articles and visualization blogs. This is captured by the above visualization &#8211; but there is still a lot of noise which I need to get rid of.</p>
<p>The next visualization depicts the linkages and clusters for the keywords. There exists a link between two terms if they occur in the same document. [may take some time to load - you'll need to zoom in to get a better look - click on 'compute layout' if the clusters don't show]<br />
<script type="text/javascript" src="http://services.alphaworks.ibm.com/manyeyes/api/v1/snapshot/89ade5ae1b21b772011b3efce2280e70.js"></script><br />
Both the above visualizations depict important metrics that could be used to extract dominant themes from the browsing history. Dominance should not be inferred from frequency alone but also from the prevalence of a term across multiple pages. I still need to work on removing noise and running this on larger datasets like browsing history for a week or so. If you have any ideas or good papers to recommend that would be nice.</p>
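<p>The two metrics above can be sketched in a few lines of Python. This is only an illustration, not the code behind the visualizations: it assumes each visited page has already been reduced to a list of stemmed terms, and the function names and the combined <code>dominance</code> score are hypothetical.</p>

```python
from collections import Counter
from itertools import combinations


def term_stats(documents):
    """documents: list of token lists, one per visited page (already stemmed)."""
    term_freq = Counter()  # total occurrences of each term
    doc_freq = Counter()   # number of pages each term appears on
    links = Counter()      # co-occurrence edges between pairs of terms
    for tokens in documents:
        term_freq.update(tokens)
        unique = sorted(set(tokens))
        doc_freq.update(unique)
        # Two terms are linked if they occur in the same document.
        links.update(combinations(unique, 2))
    return term_freq, doc_freq, links


def dominance(term_freq, doc_freq, n_docs):
    # Hypothetical score: raw frequency weighted by the share of pages containing the term.
    return {t: term_freq[t] * doc_freq[t] / n_docs for t in term_freq}
```

<p>A term that shows up many times on a single page scores lower here than one that is spread over several pages, which is the behaviour I am after. The <code>links</code> counter is the edge list for the cluster view.</p>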
]]></content:encoded>
			<wfw:commentRss>http://semanticvoid.com/blog/2008/07/19/glimpse-visualizing-your-browsing-history/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Particle &#8211; on the way to a Findory</title>
		<link>http://semanticvoid.com/blog/2008/07/11/particle-on-the-way-to-a-findory/</link>
		<comments>http://semanticvoid.com/blog/2008/07/11/particle-on-the-way-to-a-findory/#comments</comments>
		<pubDate>Sat, 12 Jul 2008 04:52:32 +0000</pubDate>
		<dc:creator>Anand Kishore</dc:creator>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Hacking]]></category>
		<category><![CDATA[Project]]></category>
		<category><![CDATA[Tagging]]></category>

		<guid isPermaLink="false">http://semanticvoid.com/blog/2008/07/11/particle-on-the-way-to-a-findory/</guid>
		<description><![CDATA[Although I started this project as an experimental weekend thingy (to play around with Google App Engine), the project has shaped up well. Before you surf over to another blog, wondering what the hell I&#8217;m talking about, let me introduce you to &#8220;Personalized ARTICLE&#8221; aggregator (read as PARTICLE). The aim is to personalize a user&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p><center><a href="http://particle.semanticvoid.com"><img src="http://yucki.appspot.com/images/particle_logo.png" border=0/></a></center><br />
Although I started this project as an experimental weekend thingy (to play around with Google App Engine), the project has shaped up well. Before you surf over to another blog, wondering what the hell I&#8217;m talking about, let me introduce you to &#8220;<a href="http://particle.semanticvoid.com"><b>P</b>ersonalized <b>ARTICLE</b></a>&#8221; aggregator (read as PARTICLE). The aim is to personalize a user&#8217;s online reading (just like what Findory did). Findory was an excellent service and I&#8217;ll be glad if I can achieve even an iota of what Greg created. This project is at a very rudimentary and experimental stage. Rather than tapping into the user&#8217;s reading history on the site (monitored by the links clicked), the idea is to study how a user&#8217;s <b>*interests*</b>, scattered around at various &#8220;databases of interest&#8221; like del.icio.us, could be used to personalize online reading (news articles, blogs and more). This way the user could merrily browse the world wide web, bookmarking pages, doing his usual stuff and let PARTICLE worry about making this data useful.</p>
<p><a href="http://particle.semanticvoid.com"><b>Click here to try PARTICLE</b></a></p>
<p>Presently you need to provide PARTICLE with your del.icio.us username, which it uses to analyze your <b>*interests*</b> and present you with recent news stories you may like. It works well if you have a decent number of bookmarks in del.icio.us. As I mentioned, the project is at a very rudimentary stage, so don&#8217;t feel disappointed by the results (ah! the unlucky few). I encourage you to play around with the app and recommend it to others to try. I&#8217;ll be making many changes/additions in the coming weeks.</p>
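<p>As a rough illustration of the idea (and not how PARTICLE is actually implemented), here is a minimal Python sketch that builds an interest profile from bookmark tags and ranks articles against it. The data shapes, the function names and the simple tag-overlap score are all assumptions made for the example; fetching bookmarks from del.icio.us is left out.</p>

```python
from collections import Counter


def interest_profile(bookmark_tags):
    """bookmark_tags: list of tag lists, one per bookmark. Returns tag -> weight."""
    counts = Counter(tag.lower() for tags in bookmark_tags for tag in tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()} if total else {}


def rank_articles(profile, articles):
    """articles: list of (title, keywords) pairs. Returns titles, best match first."""
    def score(keywords):
        # Sum the weights of the profile tags that also appear among the article keywords.
        return sum(profile.get(k, 0.0) for k in {k.lower() for k in keywords})

    ranked = sorted(articles, key=lambda a: score(a[1]), reverse=True)
    return [title for title, _ in ranked]
```

<p>The more often a tag appears in the bookmarks, the more weight it carries, so a handful of bookmarks gives a very flat profile. That is one way to see why results improve with a decent number of bookmarks.</p>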
<p>Test drive PARTICLE at <a href="http://particle.semanticvoid.com">http://particle.semanticvoid.com</a>. Kindly leave your feedback/comments/suggestions in the comments or send me an email at &#8216;anand at semanticvoid.com&#8217;.</p>
<p><b>[UPDATE]</b> Yahoo! Research has a similar project called <a href="http://garcon.sandbox.yahoo.net/index.php">Garçon</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticvoid.com/blog/2008/07/11/particle-on-the-way-to-a-findory/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Digital Immortality</title>
		<link>http://semanticvoid.com/blog/2008/02/19/digital-immortality/</link>
		<comments>http://semanticvoid.com/blog/2008/02/19/digital-immortality/#comments</comments>
		<pubDate>Wed, 20 Feb 2008 04:23:54 +0000</pubDate>
		<dc:creator>Anand Kishore</dc:creator>
				<category><![CDATA[Interview]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Project]]></category>

		<guid isPermaLink="false">http://semanticvoid.com/blog/2008/02/19/digital-immortality/</guid>
		<description><![CDATA[Gordon Bell explains MyLifeBits in this article. A good read for those who still don&#8217;t know about the MyLifeBits project. Gordon Bell and the Sense Cam MyLifeBits is a memory surrogate. It&#8217;s digital immortality. It&#8217;s a database or transaction processing system to capture everything in your life, every keystroke, every mouse click. Basically I&#8217;m capturing [...]]]></description>
			<content:encoded><![CDATA[<p>Gordon Bell explains MyLifeBits in <a target="_blank" href="http://66.35.240.8/cgi-bin/article.cgi?f=/c/a/2008/02/17/CMVJU2EKV.DTL">this article</a>. A good read for those who still don&#8217;t know about the MyLifeBits project.</p>
<p align="center"><img src="http://semanticvoid.com/images/gordon_bell.jpg" /></p>
<p align="center"><em>Gordon Bell and the Sense Cam</em></p>
<blockquote><p><span class="georgia md" id="bodytext"><strong>MyLifeBits</strong> is a memory surrogate. It&#8217;s digital immortality. It&#8217;s a database or transaction processing system to capture everything in your life, every keystroke, every mouse click. Basically I&#8217;m capturing all the minutiae of life.</span></p></blockquote>
<p><span class="georgia md" id="bodytext"> Now that you know about MyLifeBits, you may also want to explore <a target="_blank" href="http://semanticvoid.com/lifelogger">LifeLogger</a>, my MyLifeBits inspired project.</span></p>
<p>[Update] You might also be interested in <a target="_blank" href="http://www.nextgendesigncomp.com/entrydetail.aspx?id=953">Momenta</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticvoid.com/blog/2008/02/19/digital-immortality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>LifeLogger: Now At Koders</title>
		<link>http://semanticvoid.com/blog/2007/07/13/lifelogger-now-at-koders/</link>
		<comments>http://semanticvoid.com/blog/2007/07/13/lifelogger-now-at-koders/#comments</comments>
		<pubDate>Fri, 13 Jul 2007 04:43:43 +0000</pubDate>
		<dc:creator>Anand Kishore</dc:creator>
				<category><![CDATA[LifeLogger]]></category>
		<category><![CDATA[Project]]></category>

		<guid isPermaLink="false">http://semanticvoid.com/blog/2007/07/13/lifelogger-now-at-koders/</guid>
		<description><![CDATA[I received an email from Koders, a few hours back, saying that LifeLogger had finally been approved and added to their index. This is a great step for LifeLogger as it joins other numerous open source projects on Koders. This way the project would gain more visibility and contribute to code reusability as well. I [...]]]></description>
			<content:encoded><![CDATA[<p>I received an email from Koders, a few hours back, saying that <a target="_blank" title="LifeLogger - Homepage" href="http://semanticvoid.com/lifelogger">LifeLogger</a> had finally been approved and added to their index. This is a great step for LifeLogger as it joins numerous other open source projects on Koders. This way the project should gain more visibility and contribute to code reusability as well.</p>
<p>I still need to make a dozen check-ins from the code changes/feature implementations I made for the BarCamp last week. So you&#8217;ll still find old code at Koders and the svn. I&#8217;ll try to clear up the pending check-ins by next week.</p>
<p>I found some cool stats on the <a title="LifeLogger at Koders" target="_blank" href="http://www.koders.com/info.aspx?c=ProjectInfo&#038;pid=53YUEY14244PBB2VZDEK2X6W1F">Koders page</a>:</p>
<blockquote>
<div style="font-weight: bold"><span class="chart_title">Development Cost:</span></div>
<table border="0">
<tr>
<td valign="bottom" align="center" style="height: 22px" colspan="2">
<div class="cTotal">$7,715</div>
</td>
</tr>
<tr>
<td valign="top"><strong>Assumptions</strong></td>
<td valign="top"></td>
</tr>
<tr>
<td valign="top">Lines of code:</td>
<td valign="top">1,543</td>
</tr>
<tr>
<td valign="top" style="white-space: nowrap">Person months (PM):</td>
<td valign="top">1.54</td>
</tr>
<tr>
<td valign="top">Functions required:</td>
<td>100%</td>
</tr>
<tr>
<td valign="top">Effort per KLOC:</td>
<td>1.00  <a title="Person-Months"><u>PM</u></a></td>
</tr>
<tr>
<td valign="top">Labor Cost/Month:</td>
<td>$5000</td>
</tr>
</table>
</blockquote>
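<p>The total follows directly from the listed assumptions. A small Python sketch of the arithmetic, where the function and parameter names are my own and "Functions required: 100%" is assumed to act as a multiplier of 1.0:</p>

```python
def development_cost(lines_of_code, effort_per_kloc_pm=1.00,
                     labor_cost_per_month=5000, functions_required=1.0):
    """Return (person_months, cost) from the assumptions shown on the Koders page."""
    # Effort scales with thousands of lines of code (KLOC).
    person_months = lines_of_code / 1000 * effort_per_kloc_pm * functions_required
    return person_months, person_months * labor_cost_per_month


pm, cost = development_cost(1543)
print(f"{pm:.2f} PM, ${cost:,.0f}")  # prints: 1.54 PM, $7,715
```

<p>So 1,543 lines at 1.00 person-month per KLOC is about 1.54 person-months, and at $5000 per month that comes to $7,715.</p>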
<p><a title="&gt;&gt; Click here to head over to LifeLogger at Koders &lt;&lt;" target="_blank" href="http://www.koders.com/info.aspx?c=ProjectInfo&#038;pid=53YUEY14244PBB2VZDEK2X6W1F">&gt;&gt; Click here to head over to LifeLogger at Koders &lt;&lt;</a></p>
<p>Update: I had to publish this post again using WordPress as Google Docs did not use the document name as the title of the blog post.<br />
Note: This post was written and published using Google Docs. I&#8217;m testing out Google Docs blog integration, so if the display is all messed up blame them :-).</p>
]]></content:encoded>
			<wfw:commentRss>http://semanticvoid.com/blog/2007/07/13/lifelogger-now-at-koders/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>


