Musings on Technology: Riding the Semantic Wave

Hi folks! Are you all enjoying net-surfing?

Oh man! Don't ask me! I am sick of repeating my personal profile every time I register on a social site! Why can't it be tagged, and the semantic data shared with all the other sites?

Well, it's already happening :-) Maybe you are not yet riding the semantic wave!

Just get blessed by the God of the Internet (Tim Berners-Lee):

"So the Net and the Web may both be shaped as something mathematicians call a Graph, but they are at different levels. The Net links computers, the Web links documents. Now, people are making another mental move. There is realization now, "It's not the documents, it is the things they are about which are important". Obvious, really.

Biologists are interested in proteins, drugs, genes. Businesspeople are interested in customers, products, sales. We are all interested in friends, family, colleagues, and acquaintances. There is a lot of blogging about the strain, and total frustration that, while you have a set of friends, the Web is providing you with separate documents about your friends. One in facebook, one on linkedin, one in livejournal, one on advogato, and so on. The frustration that, when you join a photo site or a movie site or a travel site, you name it, you have to tell it who your friends are all over again. The separate Web sites, separate documents, are in fact about the same thing -- but the system doesn't know it." -- revolutionary idea of Giant Global Graph (GGG)

This is how Tim, in his inimitably simple language, attempts to shape the fate of the Web (his brainchild)!

So all we need is a social graph that reuses the same source of user data and intelligently maintains the relationships between user documents. This global graph is also referred to as the Semantic Web, lifting the user experience one level up from the Net and the Web to the GGG (Web 3.0). In a nutshell, it is not about revealing one's private data to others, but rather about letting users stay connected to their data from peer sites (routing along the nodes of the social graph).

The graph is expressed in FOAF (Friend of a Friend) format. FOAF metadata can be interpreted by any other device or application that is part of the graph (photo-sharing sites, travel sites, and so on).
The FOAF document, itself an RDF file, is published at a public URI so that interested parties can share it.
Gone are the days of the XML parser and the DOM tree! The Semantic Web revolves around an RDF parser that builds an RDF graph in memory.
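
To make the "same person, many sites" idea concrete, here is a minimal sketch in plain Python. It is not a real RDF parser: triples are simple (subject, predicate, object) tuples, and the example.org URIs and the two sites are made up for illustration. Only the FOAF namespace and the foaf:name / foaf:knows terms come from the FOAF vocabulary itself.

```python
# Triples are (subject, predicate, object) tuples; a graph is a set of them.
FOAF = "http://xmlns.com/foaf/0.1/"  # the FOAF vocabulary namespace

# Hypothetical person URIs, invented for this example.
ALICE = "http://example.org/alice#me"
BOB = "http://example.org/bob#me"
CAROL = "http://example.org/carol#me"

# What a photo site and a travel site might each know about Alice.
photo_site = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", BOB),
}
travel_site = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", CAROL),
}


def merge(*graphs):
    """Union the triple sets; the same URI becomes one node in the result."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def friends_of(graph, person):
    """Return the sorted URIs that `person` is linked to via foaf:knows."""
    return sorted(o for s, p, o in graph if s == person and p == FOAF + "knows")


graph = merge(photo_site, travel_site)
print(friends_of(graph, ALICE))
```

Because both sites use the same URI for Alice, merging is just a set union: her friends from the two sites end up on a single node, and the duplicated name triple collapses into one.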

References:
"FOAF and OpenID: two great tastes that taste great together" by Dan Connolly
"Whitelisting" blog post by Sean B. Palmer

Tim pointed to the tip of the iceberg... to a new horizon for the Internet... and opened the floodgates to a plethora of possibilities for the SEMANTIC WEB!

Well, technically speaking, Web 3.0 = Web 2.0 + Semantic Web.
The software metadata builds on the existing value provided by social networks, folksonomies, and collaborative filtering that are already on the Web.

Read the story of Radar Networks by the fascinating Nova Spivack:

"Consider this scenario: Say you want to arrange a dinner at an upcoming conference. Today you might go through your address book and ping folks by e-mail to see who's attending. Then you probably send out e-mail invitations to dinner. You go back and forth with the group on the place and time, somehow you all agree, and then somebody makes a reservation. Files fly back and forth, with humans at the center.
In the semantic Web, your software agent will "know" in advance what's involved in arranging a dinner. Instead of you sending out a flurry of e-mails, the agent could cull the conference attendees and make a list of potential invitees.
It might also look through your address book to see which of your friends live in the city where the conference is being held. Once a list of potential dinner guests has been approved by you, the agent would negotiate the date and time with everyone else's agents via a calendar database, pick a restaurant from another database based on availability and your personal preferences, make the reservation, and send out directions. In a GPS-enabled world, it could even let you know how far a guest who is running late has to go."

There are many semantic magicians, like Radar Networks, Garlik, Metaweb Technologies, Powerset, and ZoomInfo!

Theories are fast translating into practice in the Semantic Web space! Semantic technology aims at encapsulating the business domain knowledge used by many applications. This means that semantic applications are thin, because they work with “smart” data. All the business rules and logic are held in the models shared across applications.

A Semantic Web application is based on an architecture of several layers: data capture and analysis, data merging, semantic modeling, display, and deployment, with each step following the standards for knowledge models (RDF, RDFS, OWL, SWRL, and SPARQL).
Semantic models represent knowledge about the world in which the system operates.

A semantic application uses knowledge models in its operation. Using the models intelligently, or "reasoning over the model", ranges from a very simple graph search to intricate inferencing over the model.
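
Both ends of that range can be sketched in a few lines of Python. The facts and the predicate names ("subClassOf", "type") below are made up for illustration. The inference step loosely mirrors the familiar subclass rule (an instance of a class is also an instance of its superclasses), and is nowhere near a full reasoner.

```python
# Hypothetical knowledge model: a set of (subject, predicate, object) facts.
facts = {
    ("Protein", "subClassOf", "Molecule"),
    ("Enzyme", "subClassOf", "Protein"),
    ("lactase", "type", "Enzyme"),
}


def superclasses(facts, cls):
    """Graph search: every class reachable by following subClassOf edges."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in facts:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def infer_types(facts):
    """Inference: add a 'type' fact for every superclass of a known type."""
    inferred = set(facts)
    for s, p, o in facts:
        if p == "type":
            for sup in superclasses(facts, o):
                inferred.add((s, "type", sup))
    return inferred


print(superclasses(facts, "Enzyme"))
print(("lactase", "type", "Molecule") in infer_types(facts))
```

Nobody stated that lactase is a molecule; the application derives it from the shared model, which is what keeps the application itself thin.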

Let's understand the semantic buzzwords:

** Taxonomies are hierarchies that establish “parent-child” relationships between their concepts.
** Simple ontologies are just networks of connections; richer ontologies include rules and constraints governing these connections.
** Knowledge Models are different from Object Models :
** The Resource Description Framework (RDF) is a foundation for representing and processing metadata; it provides interoperability between applications that exchange machine readable information on the Web. RDF integrates a variety of applications from library catalogs and world-wide directories to syndication and aggregation of news, software, and content to personal collections of music, photos, and events using XML as interchange syntax.

** The Web Ontology Language (OWL), along with the Resource Description Framework, defines the Semantic Web.
** Social Networking (who knows who) evolved into the Semantic Network (who knows what). The idea is to build reasoning around a task: a Taskonomy.
** The intelligent agent assembles the recommendations and reasoning references against the task/topic and presents them to the users. It:
> pulls the artefacts associated with tags
> finds similar questions via case-based reasoning (who are all the people who solved the same problem)
> adds the new user as context for the topic

** A Semantic Graph contains nodes connecting humans and resources (author + document). It is a directed graph consisting of vertices, which represent concepts, and edges, which represent semantic relations between the concepts.

** A Semantic Lexicon is a dictionary of words labeled with semantic classes, so that associations can be drawn between words that have not previously been encountered while building a knowledge base.
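
The semantic graph and the three agent steps from the list above can be sketched as follows. Everything here is hypothetical: the class, the relation names ("authored", "tagged", "solved", "asked_about"), and the data. The "similar questions" step is a crude tag lookup standing in for real case-based reasoning.

```python
from collections import defaultdict


class SemanticGraph:
    """Directed graph: vertices are concepts, labelled edges are relations."""

    def __init__(self):
        self.edges = defaultdict(set)  # (source, relation) -> set of targets

    def add(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def targets(self, source, relation):
        """Vertices reached from `source` along `relation` edges."""
        return set(self.edges.get((source, relation), set()))

    def sources(self, relation, target):
        """Vertices that point at `target` along `relation` edges."""
        return {s for (s, r), ts in self.edges.items()
                if r == relation and target in ts}


def recommend(graph, user, topic):
    # Step 1: pull the artefacts associated with the tag.
    artefacts = graph.sources("tagged", topic)
    # Step 2: find the people who solved similarly tagged questions.
    solvers = set()
    for artefact in artefacts:
        solvers |= graph.sources("solved", artefact)
    # Step 3: add the new user as context for the topic.
    graph.add(user, "asked_about", topic)
    return sorted(artefacts), sorted(solvers)


g = SemanticGraph()
g.add("alice", "authored", "doc-rdf-intro")
g.add("doc-rdf-intro", "tagged", "rdf")
g.add("bob", "solved", "q-parse-foaf")
g.add("q-parse-foaf", "tagged", "rdf")

print(recommend(g, "carol", "rdf"))
```

After the call, carol is herself a node in the graph, so the next person asking about the same topic can be pointed at her too.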

So what are the semantic possibilities?
Reference: Ontology Modelling White Paper (TopQuadrant)

(1) Navigational Search
The idea is to use topical directories, or taxonomies, to help people narrow in on the general neighborhood of the information they seek.

A taxonomy that includes user profiles, user goals, and the typical tasks performed is used to drive a search engine. Multiple interrelated taxonomies are used to optimize the information accessed by different stakeholders. Taxonomies and ontologies are used to suggest related subjects.
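
As a rough illustration of narrowing in and suggesting related subjects, here is a tiny taxonomy stored as a child-to-parent map. The topics are invented, and treating siblings as "related subjects" is only one simple assumption about how related subjects could be chosen.

```python
# Hypothetical taxonomy: each topic maps to its parent (None marks the root).
PARENT = {
    "Science": None,
    "Biology": "Science",
    "Physics": "Science",
    "Proteins": "Biology",
    "Genes": "Biology",
}


def breadcrumb(topic):
    """Path from the root down to `topic`, i.e. the neighborhood it sits in."""
    path = []
    while topic is not None:
        path.append(topic)
        topic = PARENT[topic]
    return list(reversed(path))


def related(topic):
    """Suggest sibling topics: those sharing the same parent."""
    parent = PARENT[topic]
    return sorted(t for t, p in PARENT.items() if p == parent and t != topic)


print(breadcrumb("Proteins"))
print(related("Proteins"))
```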

(2) Automated Content Tagger

Semantic tags can be generated to make a document "well known" to external systems, so that search, integration, or invocation of other applications becomes more effective.
Tags are automatically inserted based on computer analysis of the information, typically using natural language analysis techniques. A predefined taxonomy or ontology of terms and concepts is used to drive the analysis.
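
A toy version of such a tagger might look like this. The taxonomy is made up, and plain keyword lookup stands in for the natural language analysis a real tagger would use, so it will miss plurals, synonyms, and anything else that is not an exact term.

```python
import re

# Hypothetical predefined taxonomy: term -> concept used as the tag.
TAXONOMY = {
    "rdf": "Semantic Web",
    "ontology": "Semantic Web",
    "protein": "Biology",
    "gene": "Biology",
}


def tag(text):
    """Return the sorted concepts whose terms occur as whole words in `text`."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted({concept for term, concept in TAXONOMY.items() if term in words})


print(tag("An ontology of protein and gene names, published as RDF."))
```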

(3) Topic-based Search
The goal is to provide precise, concept-aware or task-oriented search capabilities specific to an area of interest, using knowledge representations across multiple knowledge sources, both structured and unstructured.

A knowledge model provides a way to map and translate queries to knowledge resources.

(4) Context-Aware Retriever
The goal is to retrieve knowledge from one or more systems that is highly relevant to an immediate context, triggered by an action taken within a specific setting, typically a user interface. Users no longer need to leave the application they are in to find the right information.

A knowledge model is used to represent the context. This “profile” is then used to constrain a concept-based search.
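
Here is a minimal sketch of a profile constraining a concept-based search. The documents, the concepts, and the rule "keep a hit only if it shares at least one concept with the context" are all assumptions made for the example.

```python
# Hypothetical knowledge base: each document is annotated with concepts.
DOCS = [
    {"title": "Resetting a router", "concepts": {"network", "hardware"}},
    {"title": "Invoice approval flow", "concepts": {"finance", "workflow"}},
    {"title": "VPN troubleshooting", "concepts": {"network", "security"}},
]


def retrieve(query_concept, context):
    """Concept search, kept only where the document overlaps the context profile."""
    hits = [d for d in DOCS if query_concept in d["concepts"]]
    return [d["title"] for d in hits if d["concepts"] & context]


# The user is working in a (made-up) security console, so that is the context.
print(retrieve("network", {"security"}))
```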

(5) Expert Locator
The goal is to provide users with convenient access to experts in a given area who can help with problems, answer questions, locate and interpret specific documents, and collaborate on specific tasks. Knowing who is an expert in what can be difficult in an organization with a large workforce of experts. An Expert Locator could also identify experts across organizational barriers.

The profiles of experts are expressed in a knowledge model. This can then be used to match concepts in queries to locate experts.
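
The matching step could be sketched like this. The expert profiles are invented, and ranking by the number of shared concepts is just one simple scoring choice.

```python
# Hypothetical expert profiles: name -> concepts the person is known for.
EXPERTS = {
    "alice": {"rdf", "owl", "sparql"},
    "bob": {"olap", "sql"},
    "carol": {"sparql", "sql"},
}


def locate(query_concepts):
    """Rank experts by shared concepts; drop anyone with no overlap."""
    scored = [(len(skills & query_concepts), name) for name, skills in EXPERTS.items()]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in ranked if score > 0]


print(locate({"sparql", "owl"}))
```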

These are just a few mind-boggling techniques!
Let's now see how the guru of intelligent machines is materializing the dream of a UNIFIED WORLD DATABASE! Here is the real Web 3.0 machine!
Reference: The New York Times

The idea of a centralized database storing all of the world’s digital information is a fundamental shift away from today’s World Wide Web, which is akin to a library of linked digital documents stored separately on millions of computers where search engines serve as the equivalent of a card catalog.... information is structured in such a way so that software programs can discern relationships and even meaning.
For example, an entry for California’s governor, Arnold Schwarzenegger, would be entered as a topic that would include a variety of attributes or “views” describing him as an actor, athlete and politician — listing them in a highly structured way in the database

I searched for 'Unicorn', and Freebase organized the info in the most meaningful manner!

Reference: Tim O'Reilly
But once you understand a bit about what metaweb is doing, you realize just how remarkable it is. Metaweb has slurped in the contents of several of the web's freely accessible databases, including much of wikipedia, and song tracks from musicbrainz. It then turns its users loose on not just adding more data items but making connections between them by filling out meta tags that categorize or otherwise connect the data items, using a typology that can be extended by users, wiki-style.

Well, now how 'bout building the family tree of the whole world! Sounds crazy? Why not just get started @Geni.

Happy riding the semantic wave!

Sunday, March 9, 2008