
Commit 1abb7c5

theo-o authored and gitbook-bot committed
GitBook: [master] 31 pages modified
1 parent d435821 commit 1abb7c5

30 files changed

+286
-81
lines changed

SUMMARY.md

+7-12
Original file line number | Diff line number | Diff line change
@@ -11,15 +11,8 @@
1111
* [Style Guide](services/ion/development/style-guide.md)
1212
* [Maintainer Workflow](services/ion/development/maintainer-workflow.md)
1313
* [Repository Maintenance](services/ion/development/repository-maintenance.md)
14-
* [Production](services/ion/production/README.md)
15-
* [Setup](services/ion/production/setup.md)
16-
* [Management Commands](services/ion/production/management-commands.md)
17-
* [Update Workflow](services/ion/production/update-workflow.md)
18-
* [Database](services/ion/production/database.md)
19-
* [User Experience](services/ion/user-experience/README.md)
20-
* [User Interface](services/ion/user-experience/user-interface.md)
21-
* [Architecture](services/ion/architecture/README.md)
22-
* [Architecture](services/ion/architecture/architecture.md)
14+
* [Production](services/ion/production.md)
15+
* [User Experience](services/ion/user-experience.md)
2316
* [Director](services/director/README.md)
2417
* [Development](services/director/development/README.md)
2518
* [Vagrant Setup](services/director/development/untitled.md)
@@ -32,8 +25,6 @@
3225
* [User Experience](services/director/user-experience/README.md)
3326
* [User Interface](services/director/user-experience/user-interface.md)
3427
* [Production](services/director/production/README.md)
35-
* [Management Commands](services/director/production/management-commands.md)
36-
* [Troubleshooting](services/director/production/troubleshooting.md)
3728
* [Setup](services/director/production/setup.md)
3829
* [Architecture](services/director/architecture/README.md)
3930
* [Architecture](services/director/architecture/architecture.md)
@@ -141,6 +132,7 @@
141132
* [Postfix](technologies/mail/postfix.md)
142133
* [Dovecot](technologies/mail/dovecot.md)
143134
* [Monitoring](technologies/monitoring/README.md)
135+
* [Prometheus](technologies/monitoring/prometheus.md)
144136
* [Grafana](technologies/monitoring/grafana.md)
145137
* [Sentry](technologies/monitoring/sentry.md)
146138
* [Uptime Robot](technologies/monitoring/uptime-robot.md)
@@ -221,11 +213,12 @@
221213
* [NOR](machines/racks/nor.md)
222214
* [General](general/README.md)
223215
* [Sysadmins List](general/sysadmins-list.md)
216+
* [Structure](general/structure/README.md)
217+
* [Organization](general/structure/organization.md)
224218
* [Documentation](general/documentation/README.md)
225219
* [Security](general/documentation/security.md)
226220
* [Communication](general/communication.md)
227221
* [Understudies](general/understudies.md)
228-
* [Responsibility Assignment](general/blame-assignment.md)
229222
* [Account Structure](general/tjhsst-account-guide.md)
230223
* [Machine Room](general/machine-room.md)
231224
* [Branding](general/branding.md)
@@ -249,4 +242,6 @@
249242
* [Policies](policies/README.md)
250243
* [Data Release Policy](policies/data-release-policy.md)
251244
* [Upgrade Policy](policies/upgrade-policy.md)
245+
* [Account Policy](policies/account-policy.md)
246+
* [Election Policy](policies/student-leadership-election-policy.md)
252247

general/README.md

+21
@@ -2,3 +2,24 @@
22

33
The **Student Systems Administrators** \(also known as **sysadmins**\) are a group of students who have distinguished themselves to be technologically competent as well as exceptionally responsible and dependable. These students are responsible for all of the Computer Systems Lab's infrastructure. These administrators, however, are students first and administrators second. The goal of the administrator program is not to create a perfect system but to give the student system administrators valuable experience in how a real networked environment works.
44

5+
## Mission
6+
7+
The mission of the Sysadmin program is to provide real-world experience with a production environment for interested students at TJ, while supporting the school and, in particular, the Computer Systems Lab in its mission to "provide students with a challenging learning environment focused on math, science, and technology, to inspire joy at the prospect of discovery, and to foster a culture of innovation based on ethical behavior and the shared interests of humanity." Working in collaboration with other stakeholders, we aim to provide the students of TJ with an unparalleled educational experience.
8+
9+
To that end, we currently maintain the infrastructure behind Ion \(school-wide management application for TJ's embedded activities period\), Director \(website management application used by TJ's clubs and web development classes\), mailservers for students, high performance clusters, and other services to support TJ.
10+
11+
## Our Values
12+
13+
We are:
14+
15+
* **Responsible:** We take great pride in our systems and their high availability. We recognize that many community members rely on our services as part of their daily life, and we strive to ensure that our services remain available for their use.
16+
* **Collaborative:** Teamwork and collaboration are important for us to serve our mission. Thus, we strive to maintain a collaborative culture. No one can do anything alone.
17+
* **Open Source**: To prevent vendor lock-in and contribute to the "shared interests of humanity", we take great pride in supporting and using open source and free-software projects and technologies. To the extent possible, we aim to use free software in the Lab, together with software that promotes open standards and interoperability.
18+
* **Results-Oriented:** We expect Sysadmins to take ownership of the tasks they are assigned and be proactive about seeing them through.
19+
* **Iterative**: We know that our services, infrastructure, configuration, and architecture always have room for improvement, and we believe in the power of constant iteration. We empower individual Sysadmins to raise concerns, propose improvements, test improvements in a controlled fashion, and research ways to improve the Lab's services, architecture, software, or infrastructure.
20+
* **Respectful:** We are respectful of others' backgrounds and beliefs, treating each other with dignity and respect.
21+
* **Deliberative:** We recognize the critical importance of careful decision-making, especially as it applies to our infrastructure. Therefore, we take great care to listen to all perspectives and consider all implications before making decisions.
22+
* **Independent:** Although we function within TJ, we aim to make decisions independently of others as it best serves the Computer Systems Lab. This does not mean that we do not solicit feedback, just that we make technical and architectural decisions as a team.
23+
* **Agile:** We focus on delivering results, not processes. We use processes to keep improving, prevent mistakes, and comply with the rules, not to bog down work.
24+
* **Efficient:** We strive to reduce unnecessary human interaction with our systems.
25+

general/blame-assignment.md

-10
This file was deleted.

general/documentation/README.md

+5-1
@@ -2,7 +2,7 @@
22

33
## Overview
44

5-
Documentation in the Computer Systems Lab is done through GitBook.
5+
Overall documentation in the Computer Systems Lab is done through GitBook.
66

77
{% page-ref page="../../technologies/tools/gitbook.md" %}
88

@@ -31,6 +31,10 @@ In the GitBook web editor, use `Ctrl+/` to display various commands. Read the Gi
3131

3232
In the GitHub repo, make commits as you work in Markdown.
3333

34+
## Runbooks
35+
36+
37+
3438
## Writing Good Documentation
3539

3640
Here are a few pieces of advice to writing good documentation:

general/structure/README.md

+2
@@ -0,0 +1,2 @@
1+
# Structure
2+

general/structure/organization.md

+141
@@ -0,0 +1,141 @@
1+
# Organization
2+
3+
## Teams
4+
5+
Work among the Sysadmins is split among a variety of teams, each working on a specific area of the Lab.
6+
7+
### Ion
8+
9+
The Ion team is responsible for the administration, maintenance, and development of [Ion](../../services/ion/).
10+
11+
### Director
12+
13+
The Director team is responsible for the administration, maintenance, and development of [Director](../../services/director/). They also are responsible for ensuring the high availability of websites hosted on Director.
14+
15+
### Web Services
16+
17+
The Web Services \(or WWW\) team is responsible for maintaining the web presence of the Lab not supervised by other teams. This includes tjhsst.edu and sysadmins.tjhsst.edu. They are also responsible for \*.tjhsst.edu domains not supervised by other teams or by the Infrastructure Lead. They are also responsible for managing TJ's proxy configuration file.
18+
19+
### Mail
20+
21+
The Mail team is responsible for maintaining TJ's mail servers, list servers, and webmail clients \(shared with Web Services\).
22+
23+
### Signage
24+
25+
The Signage team is responsible for maintaining TJ's [Signage displays](../../services/signage/). They work closely with the Ion team in this regard.
26+
27+
### Networking
28+
29+
The Networking team is responsible for managing the [CSL's network infrastructure](../../technologies/networking/), including switches, networking connections, [OpenVPN](../../technologies/networking/openvpn.md), [NTP](../../technologies/networking/ntp.md), [DNS](../../technologies/networking/dns/), and [DHCP](../../technologies/networking/dhcp.md). They are responsible for the smooth flow of network traffic. They are also the point persons when diagnosing networking connections on CSL systems.
30+
31+
### Monitoring
32+
33+
The Monitoring team is responsible for [observability in the CSL](../../technologies/monitoring/), including logging, alerts, and metrics. They are responsible for maintaining systems that provide monitoring capability such as [Grafana](../../technologies/monitoring/grafana.md) and Prometheus.
34+
35+
### Storage
36+
37+
The Storage team is responsible for the storage of data in the Lab including [Ceph](../../technologies/storage/ceph/) and [OpenAFS](../../technologies/storage/afs/openafs.md). They are also responsible for the CSL's data backups.
38+
39+
### Documentation
40+
41+
The Documentation team is responsible for accurate, comprehensive, and well-written documentation for the Sysadmins. They assist and strongly encourage other teams in documenting everything in both our [Runbooks](../documentation/#runbooks) and this [Docsite](../documentation/).
42+
43+
### Academic Services
44+
45+
The Academic Services team is responsible for maintaining software that is used by TJHSST classes. This includes [Othello](../../services/othello/), the TJHSST AI Grader, and Tin. Due to the presence of many services, there may be a sub-team for each service.
46+
47+
### Printing
48+
49+
The Printing team is responsible for [printing operations in the Lab](../../services/printing/), including the CUPS server and the printers.
50+
51+
### Cluster
52+
53+
The Cluster team is responsible for maintaining [TJ's clusters](../../services/cluster/) \(Borg and HPC\).
54+
55+
### Advanced Computing Hardware
56+
57+
The Advanced Computing Hardware Team is responsible for the maintenance of hardware within the Lab used for advanced computing, including GPUs.
58+
59+
### Understudy Coordinator
60+
61+
The Understudy Coordinator is responsible for leading the [Understudy program](../understudies.md). The Coordinator is primarily responsible for planning the structure and activities of the Understudy program.
62+
63+
## Infrastructure Lead
64+
65+
The Infrastructure Lead is one of the Lead Sysadmins who is responsible for broadly supervising all facets of the Lab's infrastructure.
66+
67+
The Infrastructure Lead is also responsible for:
68+
69+
* prioritizing work among the Sysadmins
70+
* allocating work among the teams
71+
* ensuring work is done in a timely manner
72+
* ensuring best security practices and policies in the Lab
73+
* setting abuse guidelines
74+
* spearheading automation efforts
75+
* maintaining the GitLab issue tracker
76+
77+
The Infrastructure Lead provides recommendations and feedback on changes to the Lab's architecture or to substantial technical changes.
78+
79+
The Infrastructure Lead is **NOT** a person who takes on all responsibility. Instead, the Lead delegates work, authority, and responsibility to other teams and people.
80+
81+
**Qualifications:**
82+
83+
* Has an extraordinary knowledge of the Lab and the relationship between its software, services, and technologies
84+
* Has a broad range of expertise touching various aspects of the Lab's infrastructure
85+
* Has shown an extraordinary level of dedication to the program and its mission/values
86+
* Is organized
87+
88+
**Responsibilities**
89+
90+
* LDAP
91+
* Kerberos
92+
* VM servers
93+
* iLOs
94+
* Security
95+
* CSL architectural decisions
96+
97+
## Lead Sysadmins
98+
99+
The Lead Sysadmins make up the Sysadmin Leadership Team together with the Faculty Sponsor and are the final decision-makers in the Sysadmins. They make the final call with respect to team organization/membership, access requests, and all decisions related to the Lab.
100+
101+
They are appointed by the outgoing Lead Sysadmins with approval from the Faculty Sponsor.
102+
103+
In another sense, the Lead Sysadmins are the Presidents. They may appoint Junior Lead Sysadmins \(Vice Presidents\), if those people are expected to become Lead Sysadmins in the next year.
104+
105+
### Senior Sysadmins
106+
107+
Senior Sysadmins are sysadmins who are seniors. Being a senior does not by itself confer any additional rights or responsibilities. Instead, by virtue of having served in the Lab for a long time, they often have the most experience in a specific area.
108+
109+
## Team Structure
110+
111+
**Lead\(s\)**
112+
113+
Leads are the Directly Responsible Individuals by default on a team. They are responsible for serving as the primary point of contact with respect to the team. If there is an incident relating to their team, the Lead\(s\) must be the ones to report it.
114+
115+
Leads should:
116+
117+
* stay apprised of their team's work
118+
* have extensive knowledge of the team's functional area
119+
* supervise the work done by their team members
120+
* report on their work to the broader Sysadmin team
121+
122+
> Apple coined the term "directly responsible individual" \(DRI\) to refer to the one person with whom the buck stopped on any given project. The idea is that every project is assigned a DRI who is ultimately held accountable for the success \(or failure\) of that project.
123+
>
124+
> They likely won't be the only person working on their assigned project, but it's "up to that person to get it done or find the resources needed."
125+
>
126+
> ... What's most important is that they're empowered.
127+
128+
> Source: [https://about.gitlab.com/handbook/people-operations/directly-responsible-individuals/](https://about.gitlab.com/handbook/people-operations/directly-responsible-individuals/)
129+
130+
**Deputy**
131+
132+
The Deputy \(or Deputy Lead\) is a backup to the Lead\(s\) and defers to their opinion. If the Lead\(s\) is/are not available, the Deputy should be able to temporarily take over. A Deputy is only appointed if the Sysadmin has demonstrated competence and trust that would make him/her already qualified to be a Lead.
133+
134+
**Backup**
135+
136+
If there is no Deputy, a team has a Backup, who is someone that can step in for a Lead in the Lead's absence. The Backup is generally a Lead Sysadmin or a Sysadmin who has previously led that team.
137+
138+
**Team Members**
139+
140+
Team members are people who significantly contribute to a team's goals. Passive involvement does not mean that someone is a team member. They operate under the direction of the Team Lead.
141+

general/understudies.md

+1-1
@@ -1,6 +1,6 @@
11
# Understudies
22

3-
The Student Systems Administrator Understudy program is designed to help prepare interested students as Systems Administrator. The Understudy program has existed in the Lab since before the creation of Livedoc.
3+
The Student Systems Administrator Understudy program is designed to help prepare interested students to be Systems Administrators. The Understudy program has existed in the Lab since before the creation of Livedoc.
44

55
## History
66

policies/account-policy.md

+18
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,18 @@
1+
# Account Policy
2+
3+
## Introduction
4+
5+
The CSL maintains accounts for students and staff at TJHSST. This document governs authentication in the CSL.
6+
7+
## Policy
8+
9+
### General
10+
11+
* All TJ students are issued a TJ CSL account with a username in the form of `20XXauser` \(the graduation year, first initial, and first seven letters of the last name\)
12+
* Should two or more students map to the same username under this convention, those later in the alphabet shall be issued a username with an incrementing number at the end
13+
* For example, if the class of 2020 has Jason Doe, John Doe, and Julian Doe, Jason Doe shall be `2020jdoe`, John Doe shall be `2020jdoe1`, and Julian Doe shall be `2020jdoe2`
14+
* If a Jane Doe enters TJ as a sophomore, she shall be issued `2020jdoe3`
15+
* TJ staff members may request a TJ CSL account, which shall use the same username as their FCPS employee username.
16+
* The Sysadmins shall terminate a TJ CSL account for TJ staff members once their relationship with the school is terminated.
17+
* All activities on the CSL network shall be governed by the FCPS Acceptable Use Policy.
18+
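The username convention in the policy above can be sketched in Python. This is only an illustration, not the CSL's actual provisioning code: the names `base_username` and `UsernameRegistry` are hypothetical, and two details are assumptions the policy does not spell out, namely that "later in the alphabet" means sorting by last name and then first name, and that non-letter characters in a last name (such as hyphens) are dropped before truncation.

```python
from collections import defaultdict


def base_username(grad_year, first, last):
    """Graduation year + first initial + first seven letters of the last name."""
    # Assumption: non-letter characters (hyphens, spaces) are ignored.
    letters = [c for c in last.lower() if c.isalpha()]
    return f"{grad_year}{first[0].lower()}{''.join(letters[:7])}"


class UsernameRegistry:
    """Hypothetical in-memory record of issued usernames."""

    def __init__(self):
        # Maps each base username to the number of accounts issued for it.
        self._issued = defaultdict(int)

    def issue(self, grad_year, first, last):
        """Issue one username; collisions get an incrementing numeric suffix."""
        base = base_username(grad_year, first, last)
        count = self._issued[base]
        self._issued[base] += 1
        return base if count == 0 else f"{base}{count}"

    def issue_cohort(self, grad_year, students):
        """Issue usernames for (first, last) tuples in alphabetical order."""
        # Assumption: "later in the alphabet" = sorted by last, then first name.
        ordered = sorted(students, key=lambda s: (s[1].lower(), s[0].lower()))
        return {student: self.issue(grad_year, *student) for student in ordered}


registry = UsernameRegistry()
cohort = registry.issue_cohort(
    2020, [("Julian", "Doe"), ("Jason", "Doe"), ("John", "Doe")]
)
late_arrival = registry.issue(2020, "Jane", "Doe")  # enters as a sophomore
```

Because the registry only counts how many accounts share a base username, a student who arrives after the initial cohort receives the next free suffix and nobody is renumbered, which matches the Jane Doe case in the policy.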
