<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>冒号空间 &#187; gmail</title>
	<atom:link href="http://blog.zhenghui.org/tag/gmail/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.zhenghui.org</link>
	<description>Nature, Humans, Machines</description>
	<lastBuildDate>Fri, 16 Jul 2010 18:33:48 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Comments and Counter-Comments on "A Proposal on Organization of Information System"</title>
		<link>http://blog.zhenghui.org/2009/08/20/a-proposal-on-organization-of-information-system-review/</link>
		<comments>http://blog.zhenghui.org/2009/08/20/a-proposal-on-organization-of-information-system-review/#comments</comments>
		<pubDate>Thu, 20 Aug 2009 04:23:13 +0000</pubDate>
		<dc:creator>hui</dc:creator>
				<category><![CDATA[信息管理]]></category>
		<category><![CDATA[gmail]]></category>

		<guid isPermaLink="false">http://blog.zhenghui.org/?p=313</guid>
		<description><![CDATA[A reply to reader Plusy's comments on "A Proposal on Organization of Information System"]]></description>
			<content:encoded><![CDATA[<h2><span style="font-style: normal">Comments and Counter-Comments on "A Proposal on Organization of Information System"</span></h2>
<strong>Comments from reader Plusy</strong>

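re: A Proposal on Organization of Information System 2008-05-20 02:04 plusy
<p style="text-align: justify">First of all, thank you for sharing your ideas.</p>
<p style="text-align: justify">Here I would like to add some of my own understanding of Gmail's label system.</p>
<p style="text-align: justify">Gmail's label system feels to me like a list (List). Setting aside threads and chronological ordering, it is more like a dictionary in which the labels are the keys and the messages are the values. If weights are introduced, it becomes more like a queue (Queue); if a tree hierarchy is introduced, it amounts to rebuilding a file-system structure; if a graph structure is introduced, complex connections become possible. From a cognitive point of view, a label puts an index on the raw information, that is, it adds semantics, and the links between labels are a further layer of semantics. Weight, parent-child relation and multidimensional association are the basic semantics expressed by queues, trees and graphs respectively. The key here is to let semantics organize the information.</p>
<p style="text-align: justify">Using access frequency as the weight, the "main label" as "relevance", and threads as the aggregation engine: these three methods all derive from tracking user behavior and can run automatically, as Gmail's filter does. Directed associations between labels, aliases and folder naming, however, require user intervention, because a machine cannot understand them precisely. A better approach may be to integrate manual intervention, for example into a label navigation system, a content analysis system, or even a search system, all of which require continuous observation and memory of user behavior. That is how I understand your proposal from the semantic and syntactic points of view.</p>
<p style="text-align: justify">In addition, a label system that works purely at the syntactic level may run into some difficulties in a mail system. Here are some problems I have met myself, for your reference in the design:</p>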
<p style="text-align: justify">(1) Labels may contain typos, which makes associations based on text comparison fail. Several aliases may appear: for example, "经管" and "尽管" (homophones that differ by one mistyped character) are both meant to stand for "经济与管理" (economics and management). Such user carelessness calls for a fault-tolerance mechanism, or some way of resolving the exception.</p>
<p style="text-align: justify">(2) Will the trouble of maintaining a large number of labels cancel out the benefit they bring? We use file systems to hide the inconvenience of maintaining inodes directly, and now we use labels to hide the inconvenience of the file tree. The benefit of flatness that labels bring may be consumed by the complexity of graphs and trees, creating a new maintenance burden. For example, in Gmail I use prefixed labels (alphabetical order expresses priority, a common prefix expresses a tree-like relation), but when there are too many labels the label list becomes too long to fit on one screen.</p>
<p style="text-align: justify">(3) Conflicts in the alias mechanism. You already mention this in your proposal. If the degree of attention is extracted textually (by searching and sorting), it may lead to self-recursive loops, which are awkward to implement. I guess that Gmail does not allow one filter to use another filter precisely to avoid this problem.</p>
<p style="text-align: justify">Whether or not my understanding is accurate, and whether or not these special cases are of any value, I hope to be able to use the label system you envision soon.</p>
<p style="text-align: justify">Finally, thank you: your proposal has once again set me thinking about Gmail's label system.</p>
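<p style="text-align: justify"><strong>My counter-comments</strong></p>
<p style="text-align: justify">I am delighted to receive such a professional comment! The article was written in haste and some details were not fully developed; it was meant as a brick tossed out to attract jade, and sure enough, your piece of jade has been drawn out. Please allow me to comment on your comments in turn :-)</p>
<p style="text-align: justify">&gt;&gt;<em>A label puts an index on the raw information, that is, it adds semantics, and the links between labels are a further layer of semantics. Weight, parent-child relation and multidimensional association are the basic semantics expressed by queues, trees and graphs. The key here is to let semantics organize the information.</em></p>
<p style="text-align: justify"><span style="color: blue">Exactly right!</span></p>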
<p style="text-align: justify">&gt;&gt;<em><span style="font-family: 宋体">访问频率作为权重、“主标签”作为“相关度”和线信作为聚合引擎，这三种方法都是基于对用户行为的跟踪得来的，可以自动执行。</span></em></p>
<p style="text-align: justify"><span style="color: blue">1.</span><span style="color: blue; font-family: 宋体">访问频率基于用户行为，但用户可预先赋予不同的标签以不同的初始值；</span></p>
<p style="text-align: justify"><span style="color: blue">2.</span><span style="color: blue; font-family: 宋体">相关度大多需用户定义，机器难以识别，基于内容并不可靠，何况有些是</span><span style="color: blue">binary</span><span style="color: blue; font-family: 宋体">；</span></p>
<p style="text-align: justify"><span style="color: blue">3.gmail</span><span style="color: blue; font-family: 宋体">提供的</span><span style="color: blue">thread</span><span style="color: blue; font-family: 宋体">是基于</span><span style="color: blue">subject</span><span style="color: blue; font-family: 宋体">的，如果邮件改换</span><span style="color: blue">subject</span><span style="color: blue; font-family: 宋体">，则属于不同的</span><span style="color: blue">conversation</span><span style="color: blue; font-family: 宋体">。我们需要用户有自定义</span><span style="color: blue">thread</span><span style="color: blue; font-family: 宋体">的权力。此外，对非邮件的信息系统（如文件系统），</span><span style="color: blue">thread</span><span style="color: blue; font-family: 宋体">是难以由机器生成的。</span></p>
<p style="text-align: justify">&gt;&gt;<span style="font-family: 宋体">比较好的可能是集成人工干预，例如标签的导航系统，内容分析系统，甚至搜索系统，这些都需持续的行为观察和记忆。</span></p>
<p style="text-align: justify"><span style="color: blue; font-family: 宋体">非常正确！一个智能的系统应该对用户行为有一定的预判力，这离不开平时对用户行为的观察和记忆。</span></p>
<p style="text-align: justify">&gt;&gt;<em><span style="font-family: 宋体">标签可能会出现错别字，会导致基于文本比较的关联失败。用户的疏忽会导致需要一个容错机制，或一个异常的解决方式</span></em></p>
<p style="text-align: justify"><span style="color: blue; font-family: 宋体">说得没错。不妨与传统的树型结构比较：若用户通过鼠标点击，二者均无错别字问题；若通过文本，二者都可能出错。标签查询可类似文件路径支持通配符，此外若用户输入不存在的标签，可由机器生成一些可能的标签供用户选择。正如用户在</span><span style="color: blue">google</span><span style="color: blue; font-family: 宋体">中搜索时键入错别字，</span><span style="color: blue">google</span><span style="color: blue; font-family: 宋体">系统会提供一些可能的选择。</span></p>
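<p style="text-align: justify">The two ideas in this reply, wildcard queries and machine-generated candidates for a mistyped label, can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name <code>find_labels</code>, the sample labels and the 0.6 similarity cutoff are invented for the example and are not part of the proposal.</p>

```python
import difflib
import fnmatch


def find_labels(query, labels, max_suggestions=3):
    """Return (matches, suggestions) for a label query.

    Wildcards (* and ?) are matched the way shell file patterns are.
    When nothing matches, similar-looking labels are offered instead.
    """
    matches = [name for name in labels if fnmatch.fnmatchcase(name, query)]
    if matches:
        return matches, []
    # Hypothetical cutoff: 0.6 is difflib's default similarity threshold.
    suggestions = difflib.get_close_matches(
        query, labels, n=max_suggestions, cutoff=0.6
    )
    return [], suggestions


# Sample labels, made up for the illustration.
labels = ["economics", "management", "econ-mgmt", "movies"]
print(find_labels("econ*", labels))      # wildcard query
print(find_labels("managment", labels))  # mistyped label
```

<p style="text-align: justify">A real system would probably rank candidates by access frequency as well as by spelling similarity, but that choice is left open here.</p>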
<p style="text-align: justify">&gt;&gt;<em><span style="font-family: 宋体">维护大量的标签所带来的麻烦是否会抵消它所带来的好处。标签所带来的扁平化的好处，可能会被图、树的复杂性所消耗，从而带来新的维护负担。</span></em></p>
<p style="text-align: justify"><span style="color: blue; font-family: 宋体">这正是我想解决的问题。随着文档增多，标签不可避免地增加。如果只是控制标签数量，每个标签下的文档过多也达不到快速检索的目的。请注意该提案主要针对海量文档，如果引入少量的麻烦能解决大量的麻烦，那么这个努力是值得的。此外，该提案是向下兼容的，如果用户的文档不足够多，大可不必引入标签之间的有向关联以及标签的权重等，这就退化为</span><span style="color: blue">Gmail</span><span style="color: blue; font-family: 宋体">的标签系统了。就我个人经验而言，虽然</span><span style="color: blue">Gmail</span><span style="color: blue; font-family: 宋体">邮件并不太多，仍常常借助搜索内容而不是标签来检索。这是顾忌到</span><span style="color: blue">Gmail</span><span style="color: blue; font-family: 宋体">的标签只是一维列表，不愿引入过多标签致使列表过长。搜索内容并没有什么不好，但即使不考虑非文本内容的问题，仍有效率问题。比如，在文件系统中搜索含有某关键词的文件通常费时超过用户的容忍度。</span></p>
<p style="text-align: justify">&gt;&gt;<em><span style="font-family: 宋体">例如我自己在</span>gmail<span style="font-family: 宋体">中使用了有前缀的标签（使用字母顺表达优先级，共同前缀表达树状关联），但如果标签太多，标签列表就会太长而没办法在一屏显示。</span></em></p>
<p style="text-align: justify"><span style="color: blue; font-family: 宋体">如果标签不以列表而是层级结构来排列的话，正好可解决您的问题——具有相同前缀的标签可以有共同的父标签，可以同时展开或收拢从而节省标签结构的整体高度。</span></p>
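<p style="text-align: justify">A minimal Python sketch of that idea follows. It assumes, purely for illustration, that the common prefix is separated from the rest of the label by a slash; the function names and sample labels are hypothetical as well.</p>

```python
def build_label_tree(labels, sep="/"):
    """Nest flat, prefixed label names into a dict-of-dicts tree."""
    tree = {}
    for name in labels:
        node = tree
        for part in name.split(sep):
            node = node.setdefault(part, {})
    return tree


def visible_rows(tree, expanded, prefix="", sep="/"):
    """List the label paths shown on screen; children of collapsed nodes are hidden."""
    rows = []
    for part in sorted(tree):
        path = prefix + part
        rows.append(path)
        if path in expanded:
            rows.extend(visible_rows(tree[part], expanded, path + sep, sep))
    return rows


# Five flat labels; the "a-"/"b-" prefixes express priority by alphabetical order.
flat = [
    "a-work/reports",
    "a-work/meetings",
    "a-work/travel",
    "b-home/bills",
    "b-home/photos",
]
tree = build_label_tree(flat)
print(visible_rows(tree, expanded=set()))       # every parent collapsed
print(visible_rows(tree, expanded={"b-home"}))  # one parent expanded
```

<p style="text-align: justify">With every parent collapsed, the five flat labels take up two rows instead of five, which is the saving in height described above.</p>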
<p style="text-align: justify">&gt;&gt;<em><span style="font-family: 宋体">别名机制的冲突问题。这个你在</span>proposal<span style="font-family: 宋体">中已经提到了，如果关注度是通过文本方式</span>(<span style="font-family: 宋体">搜索和排序</span>)<span style="font-family: 宋体">来提取的，则可能会导致自递归循环，实现上比较麻烦。我猜想</span>gmail<span style="font-family: 宋体">的</span>filter<span style="font-family: 宋体">中无法使用另一个</span>filter<span style="font-family: 宋体">大概是为了避免这个问题。</span></em></p>
<p style="text-align: justify"><span style="color: blue; font-family: 宋体">没有很明白您的意思。您指的是标签名（而不是别名）的冲突问题吧？其实标签名冲突不是真正的问题，如果冲突正说明它们应该合并，而这在传统的层级结构中是不可能的。如果想进一步区分，再贴另外的标签就是。</span></p>
<p style="text-align: justify"><span style="color: blue; font-family: 宋体">关于自递归循环的问题，我不能肯定您的意思。不过防止标签图出现单向回环是必要的。正如前述，本提案中关注度除访问频率外均由用户定义。另外，</span><span style="color: blue">Gmail</span><span style="color: blue; font-family: 宋体">的</span><span style="color: blue">filter</span><span style="color: blue; font-family: 宋体">虽不能组合使用，但标签可组合过滤。</span></p>
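<p style="text-align: justify">One way to keep the label graph free of directed cycles is to refuse any new parent link whose parent is already a descendant of the child. The sketch below is an assumption about how this might be done, not a design taken from the proposal; the class name <code>LabelGraph</code> and its methods are hypothetical.</p>

```python
class LabelGraph:
    """Directed graph of labels; an edge points from a label to one of its parents."""

    def __init__(self):
        self.parents = {}  # label -> set of direct parent labels

    def ancestors(self, label):
        """All labels reachable from `label` by following parent links."""
        seen = set()
        stack = list(self.parents.get(label, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen

    def add_parent(self, child, parent):
        """Link child -> parent unless that would close a directed cycle."""
        if child == parent or child in self.ancestors(parent):
            raise ValueError(f"linking {child!r} under {parent!r} would create a cycle")
        self.parents.setdefault(child, set()).add(parent)
        self.parents.setdefault(parent, set())


graph = LabelGraph()
graph.add_parent("Java", "OOP")
graph.add_parent("OOP", "programming")
try:
    graph.add_parent("programming", "Java")  # would close the loop
except ValueError as error:
    print(error)
```

<p style="text-align: justify">Checking at insertion time keeps every later traversal simple, at the cost of one graph walk per new link.</p>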
<p style="text-align: justify"><strong>A sketch of the system interface</strong></p>
<p style="text-align: justify">Finally, a brief sketch of the interface of the proposed system. It resembles the Windows file browser (explorer): on the left (any edge will do) is the label tree, and clicking any label shows on the right <em>all information items that carry that label</em>. This differs somewhat from explorer, where clicking a folder shows on the right only its <em>subfolders</em> and <em>first-level files</em>. The items on the right can be further sorted, filtered and searched by various criteria. The case of a label with multiple parents is not considered here for the moment. Besides the tree view, the labels on the left also have a list view, just like Gmail's label list, but sortable by importance, urgency, frequency of use and so on. As for aliases and threads, they can be understood as aggregations of labels and of information items respectively, which the user can click to expand or collapse.</p>
<p style="text-align: justify"><strong>Reference links</strong></p>

<a id="homepage1_HomePageDays_DaysList_ctl00_DayItem_DayList_ctl01_TitleUrl" href="http://blog.zhenghui.org/2009/08/18/a-proposal-on-organization-of-information-system-cn/">A Proposal on Organization of Information System (Chinese version)</a>

<a id="homepage1_HomePageDays_DaysList_ctl01_DayItem_DayList_ctl00_TitleUrl" href="http://blog.zhenghui.org/2009/08/17/a-proposal-on-organization-of-information-system/">A Proposal on Organization of Information System</a>]]></content:encoded>
			<wfw:commentRss>http://blog.zhenghui.org/2009/08/20/a-proposal-on-organization-of-information-system-review/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Proposal on Organization of Information System (Chinese version)</title>
		<link>http://blog.zhenghui.org/2009/08/18/a-proposal-on-organization-of-information-system-cn/</link>
		<comments>http://blog.zhenghui.org/2009/08/18/a-proposal-on-organization-of-information-system-cn/#comments</comments>
		<pubDate>Tue, 18 Aug 2009 02:05:21 +0000</pubDate>
		<dc:creator>hui</dc:creator>
				<category><![CDATA[信息管理]]></category>
		<category><![CDATA[gmail]]></category>

		<guid isPermaLink="false">http://blog.zhenghui.org/?p=17</guid>
		<description><![CDATA[While tidying my Gmail inbox yesterday I found a few things inconvenient, so I went to the "Suggest a feature for Gmail" page intending to offer some suggestions. Unexpectedly, a little bug of inspiration crept into my mind, and I was eager to catch it. Hence the article "A Proposal on Organization of Information System", written for Gmail's reference. At the risk of seeming immodest, I believe the proposal is an innovation in how information systems, including file systems and mail systems, are organized...]]></description>
			<content:encoded><![CDATA[<h2 style="text-align: justify;"><span style="font-style: normal;">A Proposal on Organization of Information System (Chinese version)</span></h2>
<span>Preface</span>
<p style="text-align: justify;">While tidying my Gmail inbox yesterday I found a few things inconvenient, so I went to the "Suggest a feature for Gmail" page intending to offer some suggestions. Unexpectedly, a little bug of inspiration crept into my mind, and I was eager to catch it. Hence the article "<a href="http://blog.zhenghui.org/2009/08/17/a-proposal-on-organization-of-information-system/"><span style="color: blue; text-decoration: none;">A Proposal on Organization of Information System</span></a>", written for Gmail's reference. At the risk of seeming immodest, I believe the proposal is an innovation in how information systems, including file systems and mail systems, are organized. To reach more colleagues in China, I have translated the article into Chinese. It was written in haste, and I welcome corrections from experts.</p>
<p style="text-align: right;" align="right">Hui Zheng, May 19, 2008</p>
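<p style="text-align: justify;">1. <em>Introduction</em></p>
<p style="text-align: justify;">We live in an age of information, yet sometimes the burden that information brings even outweighs its benefit. From the user's point of view, most information systems, including file systems, mail systems and all kinds of menu-based systems, are essentially hierarchical in structure. As the amount of information grows, the usability of such systems declines. The main flaw of this structure is that it offers only a <em>single path</em> to the target information: a wrong turn at any corner may leave the user lost. Things would improve if an information system supported <em>multiple paths</em>. With this in mind, this article proposes a practical scheme that borrows some ideas from the Gmail system.</p>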
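<p style="text-align: justify;">2. <em>The trouble with information retrieval</em></p>
<p style="text-align: justify;">Information is a fine thing, but storing and retrieving it is a headache. Day after day, a typical computer user browses and saves web pages, collects favorite bookmarks and RSS feeds, downloads files from BT or emule, sends and receives email, and writes documents or programs. He happily enjoys all of this until one day he finds himself increasingly troubled by information overload. One clear sign is that he feels a little dizzy from time to time: his desktop is a mess, with icons packed together like sardines; his bookmark menu unrolls like a giant carpet all the way to the floor; his mailbox is stuffed with mail, bulging and about to burst. He begins to realize that unless this changes, his head will surely explode before his hard disk or mailbox does. From then on he makes a habit of sorting files, bookmarks and mail into hierarchical folders, and things do improve greatly. Alas, the good times do not last: the number of documents grows rapidly, and the folders become ever more numerous and ever deeper. Saving a document to the right place takes time, and finding a document he once downloaded or created takes even longer. Shuttling through tree structures all day, he grows weary, vexed and lost. He knows he owns a towering Christmas tree hung with a dazzling array of gifts, yet few of them are within reach. Time and again, having searched high and low and still found nothing, he cannot help doubting his own memory, and occasionally the machine's memory too. Knowing full well that the missing items will never jump out by themselves, he still cannot help yelling hysterically at the computer: where on earth are those damned documents hiding? Every now and then he slips back into his old habit of saving all the latest files to the desktop, for no other reason than that it seems more convenient and more reassuring there. We cannot help asking: what is the root of this predicament?</p>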
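<p style="text-align: justify;">3. <em>Gmail's solution</em></p>
<p style="text-align: justify;">The problem lies in the traditional way of organizing information, namely the tree (or forest) structure. This hierarchical structure copes comfortably with large amounts of information but buckles under truly massive amounts. As the volume of information swells, the tree structure becomes less and less adequate: the lists in many folders inevitably grow long, and some folders become deeply nested. In a file system, creating shortcuts in Windows or symbolic links in Unix-like operating systems relieves some of the symptoms but clearly is no cure. As an interesting alternative, Google's Gmail provides a tool it calls the "label". A label is a textual tag that can be applied to a message together with other labels. At first many users complained about it because they were used to the folder style. But the complaints gradually subsided as users found that their messages were no longer guerrillas hiding in dense jungle but regular troops lined up for inspection. All the most recent messages are at the top, which is impossible in a carefully organized folder system. Users no longer agonize over how to classify a message; they can attach as many labels as they like to each one. Finding a particular message is easy too: they can filter by user-defined labels or by system labels such as inbox, sent, star, chat and trash. They can also search by recipient, sender, subject and message content. Better still, users can define filters that label incoming mail automatically. This solution, which we call the <em>tag structure</em> from here on, need not be confined to mail management; it can be applied effectively to file systems and to other information systems such as knowledge management systems.</p>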
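<p style="text-align: justify;">4. <em>An improved solution</em></p>
<p style="text-align: justify;">The tag structure is not perfect. Although labels are far fewer than information items, they can still proliferate. In Gmail's tag structure all user-defined labels are independent and equal, but in fact different labels may differ greatly in importance, urgency and frequency of use; some labels are inherently related; and different labels on the same item differ in relevance. For example, labels such as "work" or "family" are more important, labels such as "todo" or "exam" are more urgent, and labels such as "sports" or "film" are used more often by a sports fan or a film fan. When a programmer labels some material "Java" or "C++", he would like it to be labeled "programming language" and "OOP" automatically as well, so that it later shows up in one list. Finally, some labels may describe an item better than others. Taking all of this into account, we propose the following feasible scheme.</p>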

<ul style="margin-top: 0cm;" type="disc">
	<li style="text-align: justify;"><strong>Introduce a hierarchy into the tag structure.</strong> We treat labels as metadata of the information and organize them in a traditional tree structure. In this way we combine the best parts of both worlds, each making up for the other's weaknesses. In fact we can go further. A hierarchical tree is, in graph theory, a <em>directed tree</em>; wherever it makes sense, we can generalize the tag structure to a <em>directed graph</em> (digraph). This means that a label may have more than one parent, somewhat like multiple inheritance in some OOP languages. Clearly, when every label is a root (that is, has no sublabels), the structure degenerates to Gmail's tag structure.</li>
	<li style="text-align: justify;"><strong>Introduce importance, urgency and frequency-of-use weights for labels, so that labels can be sorted by weight.</strong> Gmail's star label can serve this purpose, but it is too coarse-grained. The frequency-of-use weight can be incremented automatically on each access, so that the most frequently used labels are always at the front. Labels can also be sorted by most recent access time. The information the user cares about most is then right before his eyes and within easy reach.</li>
	<li style="text-align: justify;"><strong>Introduce a main label.</strong> One of an item's labels can be designated the main label. In this sense the traditional tree structure is a special case of our structure: every folder name is exactly a label name. (There is one subtle difference: identical folder names under different paths do not clash the way label names do.) If the relevance of the main label is 1, the relevance of the other labels should lie between 0 and 1, which provides new criteria for searching and sorting.</li>
	<li style="text-align: justify;"><strong>Introduce alias labels.</strong> A label may have several names, which can be synonyms, abbreviations or even names in different languages. Aliases can be more powerful still: the user can define a label as a logical combination of other labels. For example, "my programs" can be defined as "my documents and programs", and "entertainment" as "sports or novels or movies".</li>
	<li style="text-align: justify;"><strong>Introduce threads.</strong> Users can create a thread to link related items. Gmail has conversations, but users cannot merge related mails themselves. Threads are very useful for following up on information and for keeping different versions of it; this kind of aggregation makes the information system more compact and coherent.</li>
</ul>
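<p style="text-align: justify;">The main-label and alias items in the list above can be made concrete with a small Python sketch. Everything named here (<code>Item</code>, <code>ALIASES</code>, <code>select</code>, the sample data and the relevance values) is hypothetical and chosen only to illustrate the idea: the main label carries relevance 1, other labels a value between 0 and 1, and an alias is a logical combination of existing labels.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    # label -> relevance; the main label has 1.0, the others lie between 0 and 1
    relevance: dict = field(default_factory=dict)

    @property
    def labels(self):
        return set(self.relevance)


# An alias is modeled as a predicate over an item's label set (an assumption).
ALIASES = {
    "my programs": lambda labels: "my documents" in labels and "programs" in labels,
    "entertainment": lambda labels: bool(labels & {"sports", "novels", "movies"}),
}


def select(items, name):
    """Items matching a plain label or an alias, most relevant first."""
    matches = ALIASES.get(name, lambda labels: name in labels)
    hits = [item for item in items if matches(item.labels)]
    return sorted(hits, key=lambda item: item.relevance.get(name, 0.0), reverse=True)


items = [
    Item("quicksort.py", {"programs": 1.0, "my documents": 0.5}),
    Item("match report", {"sports": 1.0, "work": 0.2}),
    Item("budget.xls", {"work": 1.0, "my documents": 0.8}),
]
print([item.title for item in select(items, "my programs")])
print([item.title for item in select(items, "work")])
```

<p style="text-align: justify;">Sorting by relevance puts the items whose main label is the queried label first, which is the extra search and sort criterion that the main label is meant to provide.</p>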
<p style="text-align: justify;">5. <em>Conclusion</em></p>
<p style="text-align: justify;">To locate an item, the user clicks folders to expand them in a hierarchical system, and clicks labels to filter in a tag system. We have not discussed search because search is slow and some information does not exist in text form. The tag system is the better solution, but it still has shortcomings. To make information retrieval easier still, we have designed the <em>weighted digraph tag structure</em>, a tag structure that incorporates the strengths of the tree structure. An information system with this structure should be more approachable and more pleasant to use. Its users can be like leisurely fish farmers: no matter how many fish they have put into the pond, a single whistle brings the one they want swimming over, wagging its tail.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.zhenghui.org/2009/08/18/a-proposal-on-organization-of-information-system-cn/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>A Proposal on Organization of Information System</title>
		<link>http://blog.zhenghui.org/2009/08/17/a-proposal-on-organization-of-information-system/</link>
		<comments>http://blog.zhenghui.org/2009/08/17/a-proposal-on-organization-of-information-system/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 10:50:13 +0000</pubDate>
		<dc:creator>hui</dc:creator>
				<category><![CDATA[信息管理]]></category>
		<category><![CDATA[gmail]]></category>
		<category><![CDATA[英文文章]]></category>

		<guid isPermaLink="false">http://blog.zhenghui.org/?p=12</guid>
		<description><![CDATA[<strong>A proposal for improving the organization of information systems</strong>
Information itself is great, but storing and retrieving information sucks.
From a user’s view, most information systems are essentially organized in a hierarchical structure. The major flaw of this kind of structure is that it provides only a single path to the target information. To address this, this article proposes a practical solution that borrows some ideas from the Gmail system.]]></description>
			<content:encoded><![CDATA[<h2 style="text-align: justify;"><span style="font-style: normal;">A Proposal on Organization of Information System (see also: <a href="http://blog.zhenghui.org/2009/08/18/a-proposal-on-organization-of-information-system-cn/">Chinese version</a>)
</span></h2>
<p style="text-align: right;" align="right">Authored by Hui Zheng on May 18, 2008</p>
<p style="text-align: justify;">1. <em>Introduction</em></p>
<p style="text-align: justify;">We are living in an age of information, but sometimes information imposes more burden than benefit. From a user’s view, most information systems, including file systems, mail systems and various menu-based systems, are essentially organized in a hierarchical structure. As information increases, the usability of these systems decreases. The major flaw of this kind of structure is that it provides only a <em>single path</em> to the target information. A user who takes one wrong turn may lose his way. This situation could be improved if an information system supported <em>multipath routing</em>. To address this, this article proposes a practical solution that borrows some ideas from the Gmail system.</p>
<p style="text-align: justify;">2. <em>Information retrieval problem</em></p>
<p style="text-align: justify;">Information itself is great, but storing and retrieving information sucks. Day by day, a typical computer user browses and saves web pages, collects favorite bookmarks and RSS feeds, downloads files from BT or emule, composes and receives email, and writes documents or programs. He enjoys all of this until one day he finds himself gradually suffering from information overload. As evidence, he now and then feels a little dizzy: his desktop is terribly messy, with miscellaneous icons packed like sardines; his bookmark menu unrolls all the way down like a huge blanket; and his inbox is cluttered with mail like a bulging bag. He comes to realize that if this situation does not change, his brain will explode before his hard disk or mailbox does. Thereafter, he cultivates the habit of organizing files, bookmarks and mail into hierarchical folders. As a result, things improve a lot. Unfortunately, good times don&#8217;t last long. As his documents multiply rapidly, his folders grow more numerous and more deeply nested. It takes him some time to save a document in the appropriate place, and even more time to find a document he once downloaded or composed. He tends to get tired, vexed and somewhat lost when he navigates the hierarchical trees. He knows he possesses a gigantic Christmas tree with a tremendous number of gifts hanging on it, but few of them are within reach. Time and again, he fails to find an important document after an exhaustive search. He doubts his memory, and occasionally he doubts the machine&#8217;s memory. Although he knows the missing items will never jump out at him of their own accord, he still cannot help yelling at the machine: where the hell are the damn documents hiding? From time to time, he slips back into the old habit of saving all recent files to the desktop, just for convenience and peace of mind. So, what is the root of the plague?</p>
<p style="text-align: justify;">3. <em>Gmail&#8217;s solution</em></p>
<p style="text-align: justify;">It turns out that the root of the evil is the traditional manner of organizing information, namely the tree (or forest) structure. This hierarchical structure is reasonable for large but not huge quantities of information items. As the volume of information swells, the tree structure gradually becomes unmanageable. The item lists in many folders inevitably grow long, and some folders become deeply nested. In file systems, this symptom can be alleviated to some degree by creating shortcuts in Windows or symbolic links in Unix-family operating systems, but that is not a final cure. As an interesting alternative, Google&#8217;s Gmail offers what it calls a &#8220;label&#8221;. A label is basically a tag that can be applied to a message together with other labels. Many users complained about it at first because they were used to the old folder style, but the complaints waned as time passed. Users find that their messages are no longer like guerrillas hiding in a deep forest; instead, they are like a regular army lined up for inspection. The most recent messages are at the top, which is impossible in a well-organized folder system. Users no longer puzzle over where to file a message, since they can apply as many labels as they like to it. Finding a specific message is easy too: users can filter by user-defined labels or by system-defined labels such as inbox, sent, star, chat and trash. They can also search by sender, receiver, subject and message content. Even better, users can define filters that automatically apply labels to incoming mail. This solution, which we henceforth call the <em>tag structure</em>, is not necessarily limited to mail management systems; it should apply equally well to file systems and to other information systems such as knowledge management systems.</p>
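<p style="text-align: justify;">The flat tag structure described in this section is easy to model. The Python sketch below is only an illustration and makes no claim about how Gmail is implemented; the names <code>Message</code>, <code>RULES</code>, <code>receive</code> and <code>with_labels</code>, and the sample addresses, are all assumptions.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    subject: str
    labels: set = field(default_factory=set)


# A filter rule pairs a predicate on the message with the label to apply.
RULES = [
    (lambda m: m.sender.endswith("@example.edu"), "school"),
    (lambda m: "invoice" in m.subject.lower(), "finance"),
]


def receive(mailbox, message):
    """Label an incoming message and put it at the top of the mailbox."""
    message.labels.add("inbox")  # system-defined label
    for matches, label in RULES:
        if matches(message):
            message.labels.add(label)
    mailbox.insert(0, message)   # most recent first


def with_labels(mailbox, *labels):
    """Messages carrying all of the given labels, newest first."""
    wanted = set(labels)
    return [m for m in mailbox if wanted <= m.labels]


mailbox = []
receive(mailbox, Message("dean@example.edu", "Exam schedule"))
receive(mailbox, Message("billing@example.com", "Invoice for May"))
receive(mailbox, Message("bursar@example.edu", "Tuition invoice"))
print([m.subject for m in with_labels(mailbox, "school")])
print([m.subject for m in with_labels(mailbox, "school", "finance")])
```

<p style="text-align: justify;">Because one message can carry several labels, the tuition invoice is reachable through either &#8220;school&#8221; or &#8220;finance&#8221;, which is the multipath access that a single folder tree cannot offer.</p>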
<p style="text-align: justify;">4. <em>Our solution</em></p>
<p style="text-align: justify;">However, the tag structure doesn&#8217;t always suffice for our needs. Even though tags are far fewer than information items, they can still overflow. In Gmail&#8217;s tag structure, all user-defined labels are independent and equal, but in fact they are very likely to differ in importance, urgency and popularity; some labels are inherently related; and the labels on a given information item vary in relevance. For example, labels like &#8220;work&#8221; or &#8220;family&#8221; are more important; labels like &#8220;todo&#8221; or &#8220;exam&#8221; are more urgent; labels like &#8220;sports&#8221; or &#8220;film&#8221; are more popular if the user is a sports or film fan. It&#8217;s also desirable that after a programmer labels some materials as &#8220;Java&#8221; or &#8220;C++&#8221;, those materials are automatically labeled as &#8220;programming language&#8221; and &#8220;OOP&#8221;, so that he can later get all programming-language items or OOP items in one list. Lastly, among all the labels on a given information item, one may be more relevant than the others. Taking all of this into consideration, we propose a feasible solution as follows.</p>

<ul style="margin-top: 0cm;" type="disc">
	<li style="text-align: justify;"><strong>Introduce hierarchical structure into the tag structure</strong>. That is, we treat tags/labels as metadata of information and organize them in the traditional tree structure. This way we combine the best parts of both worlds. Actually, we can go further. As we know, a hierarchical tree structure is a <em>directed tree</em> in graph theory, but we may generalize the tag structure to a <em>digraph</em> as long as it makes sense. This allows a tag to have more than one parent, something like multiple inheritance in some OOP languages. Obviously Gmail&#8217;s tag structure is a special case of our structure in which all labels are roots (i.e. they have no sublabels).</li>
	<li style="text-align: justify;"><strong>Introduce weights of importance, urgency and popularity</strong> for each tag so that tags can be sorted by any of these criteria. Gmail&#8217;s star label can serve this purpose, but it&#8217;s too coarse-grained. The popularity weight of a tag can be auto-incremented on each visit to the tag, which keeps the most frequently used tags at the top. Tags can also be sorted by most recent visit time. Consequently, users can be more confident that the documents they really care about are within reach, and accessing any interesting, active or important item in the system is a piece of cake.</li>
	<li style="text-align: justify;"><strong>Introduce a main label</strong>, i.e. one of an item&#8217;s labels can be specified as the main label. In this sense, the traditional tree structure can be viewed as a special case of our structure: any folder name is exactly a label name. (There is one subtle difference, though: unlike label names, identical folder names in different paths don&#8217;t clash.) If the main label&#8217;s relevance is 1, the relevance of the other labels should be between 0 and 1. This provides extra search and sort criteria.</li>
	<li style="text-align: justify;"><strong>Introduce alias tags</strong>. Tags are allowed to have more than one name; these names can be abbreviations, synonyms, or even names in different languages. Furthermore, aliases can be more powerful: users may define a label as a logical combination of existing labels. For example, one can define &#8220;myPrograms&#8221; as &#8220;&#8216;my documents&#8217; <em>and</em> &#8216;programs&#8217;&#8221;, or define &#8220;entertainment&#8221; as &#8220;sports <em>or</em> novel <em>or</em> movie&#8221;.</li>
	<li style="text-align: justify;"><strong>Introduce threads</strong>. Users can create threads that link related message items. Gmail has conversations, but it doesn&#8217;t allow users to merge mails by themselves. Threads are good for follow-ups and for different revisions of a document. This aggregation makes the information system more compact and coherent.</li>
</ul>
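<p style="text-align: justify;">The first two items of the list, the digraph of tags and the popularity weight, can be sketched together in Python. This is a rough illustration under stated assumptions rather than a reference design: the class <code>TagStore</code>, its method names and the sample data are hypothetical, and the check against cycles is left out for brevity.</p>

```python
from collections import defaultdict


class TagStore:
    def __init__(self):
        self.parents = defaultdict(set)     # tag -> direct parent tags
        self.popularity = defaultdict(int)  # tag -> number of visits
        self.items = defaultdict(set)       # item -> tags applied directly

    def link(self, tag, parent):
        """Declare `parent` as one of possibly several parents of `tag`."""
        self.parents[tag].add(parent)

    def expand(self, tag):
        """The tag itself plus every tag reachable through parent links."""
        seen, stack = set(), [tag]
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.parents.get(current, ()))
        return seen

    def apply(self, item, tag):
        self.items[item].add(tag)

    def visit(self, tag):
        """List the items under a tag and bump the tag's popularity weight."""
        self.popularity[tag] += 1
        return sorted(
            item
            for item, tags in self.items.items()
            if any(tag in self.expand(applied) for applied in tags)
        )

    def tags_by_popularity(self):
        """Visited tags, most popular first; ties are broken alphabetically."""
        return sorted(self.popularity, key=lambda t: (-self.popularity[t], t))


store = TagStore()
for language in ("Java", "C++"):
    store.link(language, "programming language")
    store.link(language, "OOP")
store.apply("Effective Java notes", "Java")
store.apply("STL cheat sheet", "C++")
store.apply("holiday photos", "family")
print(store.visit("OOP"))
print(store.visit("family"))
print(store.visit("OOP"))
print(store.tags_by_popularity())
```

<p style="text-align: justify;">Here an item labeled &#8220;Java&#8221; is found under &#8220;OOP&#8221; without being labeled twice, because parent tags are derived at query time instead of being copied onto each item.</p>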
<p style="text-align: justify;">5. <em>Conclusion</em></p>
<p style="text-align: justify;">To locate an information item, users need to click folders to expand them in a hierarchical system, while they need to click labels to filter in a tag system. We don&#8217;t mention search because search is slow and information content may be in binary form. The tag structure is conceivably a better solution, but it still has shortcomings. To further facilitate information retrieval, we&#8217;ve proposed a <em>weighted digraph tag structure</em>, an improved tag structure that integrates the advantages of the tree structure. An information system featuring this structure should be more accessible and enjoyable, and its users could be like pisciculturists: no matter how many fish they have thrown into the pond, any fish they desire will swim to them, tail wagging, at a single whistle.</p>]]></content:encoded>
			<wfw:commentRss>http://blog.zhenghui.org/2009/08/17/a-proposal-on-organization-of-information-system/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
		</item>
	</channel>
</rss>
