Semantic Web
Introduction
Manu Sporny has a bunch of very approachable videos on the basics of the semantic web and linked data. They are a bit outdated, but they're an easy watch.
Two-Bit History has a great article on the history of the semantic web and linked data. It might help you put the different technologies into perspective.
Random takeaways
From this SO answer: the semantic web is founded upon RDF. RDFa is one of several RDF serializations; it embeds the triples in HTML attributes.
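For concreteness: RDF models all knowledge as subject-predicate-object triples, and the same triple can be written down in several serializations. A one-triple sketch in Turtle (the person and URL are made up):

```turtle
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# subject                   predicate   object
<http://example.org/#joe>   foaf:name   "Joe Bloggs" .
```

RDFa expresses the same triples inside a web page's attributes instead of a standalone file; see the RDFa section below.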
This paper by Ruben Verborgh provides a nice overview of linked data and the machine-readable web.
Microformat
A microformat (sometimes abbreviated μF) is a World Wide Web-based approach to semantic markup which uses HTML/XHTML tags supported for other purposes to convey additional metadata and other attributes in web pages and other contexts that support (X)HTML, such as RSS. This approach allows software to process information intended for end-users (such as contact information, geographic coordinates, calendar events, and similar information) automatically.
Microformats work by adding agreed-upon semantic class names inside regular HTML attributes, like so:
```html
<div class="h-resume">
  <span class="p-name">
    <a class="p-contact h-card" href="http://example.org">
      <img src="http://example.org/photo.png" alt="" />Joe Bloggs
    </a> resume
  </span>
  <p class="p-summary">
    Joe is a top-notch llama farmer with a degree in
    <span class="p-skill">Llama husbandry</span>
    and a thirst to produce the finest wool known to man.
  </p>
</div>
```
RDFa
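RDFa plays a similar role to microformats, but isn't tied to one fixed set of class names: attributes such as vocab, typeof, and property map the markup directly to RDF triples. A minimal sketch using RDFa Lite with the schema.org vocabulary (the person and URL are made up):

```html
<div vocab="https://schema.org/" typeof="Person">
  <!-- Each property attribute below yields one RDF triple about this Person. -->
  <span property="name">Joe Bloggs</span> is a
  <span property="jobTitle">llama farmer</span>; read more on
  <a property="url" href="http://example.org">his homepage</a>.
</div>
```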
Other projects and resources
Linked data
Tim BL's Information Management: A Proposal
Jeff "Coding Machine" Zucker created sparql-fiddle, great for learning about writing and querying linked data.
Peter Norvig's The Unreasonable Effectiveness of Data seems to be a classic (video and paper (PDF)).
A huge amount of Linked Data is available on the Web. But can live applications use it? SPARQL endpoints are expensive for the server, and not always available for all datasets. Downloadable dumps are expensive for clients, and do not allow live querying on the Web.
With Linked Data Fragments, and specifically the Triple Pattern Fragments interface, we aim to explore what happens when we redistribute the load between clients and servers. We then measure the impact of such interfaces on clients, servers, and caches.
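To make the Triple Pattern Fragments idea concrete: instead of sending a whole SPARQL query to the server, the client decomposes it into single triple patterns and requests each one over plain HTTP. The exact URL template is advertised by each server through a hypermedia (Hydra) control, but a typical request looks roughly like this (endpoint and template are illustrative):

```http
GET /dataset?predicate=http%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2Fname HTTP/1.1
Host: fragments.example.org
Accept: text/turtle
```

Each response carries the matching triples plus metadata, an estimated total count and links to further pages, which is what lets the client plan and execute the rest of the query itself while the server only ever serves simple, cacheable fragments.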
Ontologies and vocabularies
- Value Flows is a set of common vocabularies to describe flows of economic resources of all kinds within distributed economic ecosystems.
ETL and data scraping
Memex programme index, a list of memex-related tools and their repository URLs.
Karma is an information integration tool that enables users to quickly and easily integrate data from a variety of data sources including databases, spreadsheets, delimited text files, XML, JSON, KML and Web APIs. Users integrate information by modeling it according to an ontology of their choice using a graphical user interface that automates much of the process. Karma learns to recognize the mapping of data to ontology classes and then uses the ontology to propose a model that ties together these classes. Users then interact with the system to adjust the automatically generated model. During this process, users can transform the data as needed to normalize data expressed in different formats and to restructure it. Once the model is complete, users can publish the integrated data as RDF or store it in a database.