These docs cover everything from setting up and running an etcd cluster to using etcd in your applications. Improvements to these docs are encouraged via etcd on GitHub. For more in-depth support, jump into #coreos on IRC, email the dev list, or file a bug.
Configuration values are distributed within the cluster for your applications to read. Values can be changed programmatically, and smart applications can reconfigure themselves automatically. You'll never again have to run a configuration management tool on every machine in order to change a single config value.
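As a rough illustration of how an application might reconfigure itself when a value changes, the sketch below long-polls a key through the v2 HTTP API's `wait=true` parameter. The endpoint address, the key name in the usage note, and the helper names (`watch_url`, `changed_value`, `watch_forever`) are assumptions made for this example, not part of etcd itself.

```python
import json
import urllib.request

# Assumption: an etcd 2.x member listening on the default local client port.
ETCD_URL = "http://127.0.0.1:2379"


def watch_url(key, wait_index=None, base_url=ETCD_URL):
    """Build the long-poll URL that blocks until the key changes."""
    url = base_url.rstrip("/") + "/v2/keys/" + key.strip("/") + "?wait=true"
    if wait_index is not None:
        # Resume from a known index so changes between polls are not missed.
        url += "&waitIndex=%d" % wait_index
    return url


def changed_value(response_body):
    """Return (value, modifiedIndex) from a v2 watch response."""
    node = json.loads(response_body)["node"]
    # A delete event carries no value, hence .get().
    return node.get("value"), node["modifiedIndex"]


def watch_forever(key, on_change):
    """Call on_change(value) every time the key is modified."""
    index = None
    while True:
        with urllib.request.urlopen(watch_url(key, index)) as resp:
            value, modified = changed_value(resp.read())
        on_change(value)
        index = modified + 1
```

With a cluster running, something like `watch_forever("/config/db_host", print)` would print each new value as it is written. The loop has no timeout or error handling, so treat it as a starting point only.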
Reads and writes to the etcd keyspace are done via a simple, RESTful HTTP API, or through language-specific libraries that wrap the HTTP API with higher-level primitives.
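To make the HTTP API concrete, here is a minimal sketch of writing and reading a key against the v2 keys endpoint using only the Python standard library. It assumes an etcd member reachable at `http://127.0.0.1:2379`. The helper names (`key_url`, `build_set_request`, `parse_value`, `set_key`, `get_key`) and the key used in the usage note are hypothetical, chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: an etcd 2.x member listening on the default local client port.
ETCD_URL = "http://127.0.0.1:2379"


def key_url(key, base_url=ETCD_URL):
    """Map a key such as '/config/db_host' onto the v2 keys endpoint."""
    return base_url.rstrip("/") + "/v2/keys/" + urllib.parse.quote(key.strip("/"))


def build_set_request(key, value, base_url=ETCD_URL):
    """Build (but do not send) the PUT request that writes a value."""
    body = urllib.parse.urlencode({"value": value}).encode("utf-8")
    return urllib.request.Request(key_url(key, base_url), data=body, method="PUT")


def parse_value(response_body):
    """Extract the stored value from a v2 JSON response."""
    return json.loads(response_body)["node"]["value"]


def set_key(key, value):
    """Write a value and return what etcd reports as stored."""
    with urllib.request.urlopen(build_set_request(key, value)) as resp:
        return parse_value(resp.read())


def get_key(key):
    """Read the current value of a key."""
    with urllib.request.urlopen(key_url(key)) as resp:
        return parse_value(resp.read())
```

With a cluster running, `set_key("/config/db_host", "10.0.0.5")` followed by `get_key("/config/db_host")` should round-trip the value. Request construction is kept separate from sending so the URL and form-encoded body can be inspected without a live server.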
High-level design goals, in-progress features, and RFCs are tracked on GitHub.