|
| 1 | +# Organization |
| 2 | + |
| 3 | +## Teams |
| 4 | + |
| 5 | +Work among the Sysadmins is split among a variety of teams, each working on a specific area of the Lab. |
| 6 | + |
| 7 | +### Ion |
| 8 | + |
| 9 | +The Ion team is responsible for the administration, maintenance, and development of [Ion](../../services/ion/). |
| 10 | + |
| 11 | +### Director |
| 12 | + |
| 13 | +The Director team is responsible for the administration, maintenance, and development of [Director](../../services/director/). They are also responsible for ensuring the high availability of websites hosted on Director. |
| 14 | + |
| 15 | +### Web Services |
| 16 | + |
| 17 | +The Web Services \(or WWW\) team is responsible for maintaining the web presence of the Lab not supervised by other teams. This includes tjhsst.edu and sysadmins.tjhsst.edu, as well as \*.tjhsst.edu domains not supervised by other teams or by the Infrastructure Lead. They are also responsible for managing TJ's proxy configuration file. |
| 18 | + |
| 19 | +### Mail |
| 20 | + |
| 21 | +The Mail team is responsible for maintaining TJ's mail servers, list servers, and webmail clients \(shared with Web Services\). |
| 22 | + |
| 23 | +### Signage |
| 24 | + |
| 25 | +The Signage team is responsible for maintaining TJ's [Signage displays](../../services/signage/). They work closely with the Ion team in this regard. |
| 26 | + |
| 27 | +### Networking |
| 28 | + |
| 29 | +The Networking team is responsible for managing the [CSL's network infrastructure](../../technologies/networking/), including switches, networking connections, [OpenVPN](../../technologies/networking/openvpn.md), [NTP](../../technologies/networking/ntp.md), [DNS](../../technologies/networking/dns/), and [DHCP](../../technologies/networking/dhcp.md). They are responsible for the smooth flow of network traffic. They are also the point persons when diagnosing networking connections on CSL systems. |
| 30 | + |
| 31 | +### Monitoring |
| 32 | + |
| 33 | +The Monitoring team is responsible for [observability in the CSL](../../technologies/monitoring/), including logging, alerts, and metrics. They are responsible for maintaining systems that provide monitoring capability such as [Grafana](../../technologies/monitoring/grafana.md) and Prometheus. |
| 34 | + |
| 35 | +### Storage |
| 36 | + |
| 37 | +The Storage team is responsible for the storage of data in the Lab including [Ceph](../../technologies/storage/ceph/) and [OpenAFS](../../technologies/storage/afs/openafs.md). They are also responsible for the CSL's data backups. |
| 38 | + |
| 39 | +### Documentation |
| 40 | + |
| 41 | +The Documentation team is responsible for accurate, comprehensive, and well-written documentation for the Sysadmins. They assist and strongly encourage other teams in documenting everything in both our [Runbooks](../documentation/#runbooks) and this [Docsite](../documentation/). |
| 42 | + |
| 43 | +### Academic Services |
| 44 | + |
| 45 | +The Academic Services team is responsible for maintaining software that is used by TJHSST classes. This includes [Othello](../../services/othello/), the TJHSST AI Grader, and Tin. Due to the presence of many services, there may be a sub-team for each service. |
| 46 | + |
| 47 | +### Printing |
| 48 | + |
| 49 | +The Printing team is responsible for [printing operations in the Lab](../../services/printing/), including the CUPS server and the printers. |
| 50 | + |
| 51 | +### Cluster |
| 52 | + |
| 53 | +The Cluster team is responsible for maintaining [TJ's clusters](../../services/cluster/) \(Borg and HPC\). |
| 54 | + |
| 55 | +### Advanced Computing Hardware |
| 56 | + |
| 57 | +The Advanced Computing Hardware team is responsible for the maintenance of hardware within the Lab used for advanced computing, including GPUs. |
| 58 | + |
| 59 | +### Understudy Coordinator |
| 60 | + |
| 61 | +The Understudy Coordinator is responsible for leading the [Understudy program](../understudies.md). The Coordinator is primarily responsible for planning the structure and activities of the Understudy program. |
| 62 | + |
| 63 | +## Infrastructure Lead |
| 64 | + |
| 65 | +The Infrastructure Lead is one of the Lead Sysadmins who is responsible for broadly supervising all facets of the Lab's infrastructure. |
| 66 | + |
| 67 | +The Infrastructure Lead is also responsible for: |
| 68 | + |
| 69 | +* prioritizing work among the Sysadmins |
| 70 | +* allocating work among the teams |
| 71 | +* ensuring work is done in a timely manner |
| 72 | +* ensuring best security practices and policies in the Lab |
| 73 | +* setting abuse guidelines |
| 74 | +* spearheading automation efforts |
| 75 | +* maintaining the GitLab issue tracker |
| 76 | + |
| 77 | +The Infrastructure Lead provides recommendations and feedback on changes to the Lab's architecture or to substantial technical changes. |
| 78 | + |
| 79 | +The Infrastructure Lead is **NOT** a person who takes on all responsibility. Instead, the Lead delegates work, authority, and responsibility to other teams and people. |
| 80 | + |
| 81 | +**Qualifications:** |
| 82 | + |
| 83 | +* Has an extraordinary knowledge of the Lab and the relationship between its software, services, and technologies |
| 84 | +* Has a broad range of expertise touching various aspects of the Lab's infrastructure |
| 85 | +* Has shown an extraordinary level of dedication to the program and its mission/values |
| 86 | +* Is organized |
| 87 | + |
| 88 | +**Responsibilities:** |
| 89 | + |
| 90 | +* LDAP |
| 91 | +* Kerberos |
| 92 | +* VM servers |
| 93 | +* iLOs |
| 94 | +* Security |
| 95 | +* CSL architectural decisions |
| 96 | + |
| 97 | +## Lead Sysadmins |
| 98 | + |
| 99 | +The Lead Sysadmins make up the Sysadmin Leadership Team together with the Faculty Sponsor and are the final decision-makers in the Sysadmins. They make the final call with respect to team organization/membership, access requests, and all decisions related to the Lab. |
| 100 | + |
| 101 | +They are appointed by the outgoing Lead Sysadmins with approval from the Faculty Sponsor. |
| 102 | + |
| 103 | +In another sense, the Lead Sysadmins are the Presidents. They may appoint Junior Lead Sysadmins \(Vice Presidents\), if those people are expected to become Lead Sysadmins in the next year. |
| 104 | + |
| 105 | +### Senior Sysadmins |
| 106 | + |
| 107 | +Senior Sysadmins are sysadmins who are seniors. By virtue of being a senior, they have no additional rights or responsibilities. Instead, by virtue of having served in the Lab for a long time, they often have the most experience in a specific area. |
| 108 | + |
| 109 | +## Team Structure |
| 110 | + |
| 111 | +**Lead\(s\)** |
| 112 | + |
| 113 | +Leads are the Directly Responsible Individuals by default on a team. They are responsible for serving as the primary point of contact with respect to the team. If there is an incident relating to their team, the Lead\(s\) must be the one to report it. |
| 114 | + |
| 115 | +Leads should: |
| 116 | + |
| 117 | +* stay apprised of their team's work |
| 118 | +* have extensive knowledge of the team's functional area |
| 119 | +* supervise the work done by their team members |
| 120 | +* report on their work to the broader Sysadmin team |
| 121 | + |
| 122 | +> Apple coined the term "directly responsible individual" \(DRI\) to refer to the one person with whom the buck stopped on any given project. The idea is that every project is assigned a DRI who is ultimately held accountable for the success \(or failure\) of that project. |
| 123 | +> |
| 124 | +> They likely won't be the only person working on their assigned project, but it's "up to that person to get it done or find the resources needed." |
| 125 | +> |
| 126 | +> ... What's most important is that they're empowered. |
| 127 | + |
| 128 | +> Source: [https://about.gitlab.com/handbook/people-operations/directly-responsible-individuals/](https://about.gitlab.com/handbook/people-operations/directly-responsible-individuals/) |
| 129 | + |
| 130 | +**Deputy** |
| 131 | + |
| 132 | +The Deputy \(or Deputy Lead\) is a backup to the Lead\(s\) and defers to their opinion. If the Lead\(s\) is/are not available, the Deputy should be able to temporarily take over. A Deputy is only appointed if the Sysadmin has demonstrated competence and trust that would make him/her already qualified to be a Lead. |
| 133 | + |
| 134 | +**Backup** |
| 135 | + |
| 136 | +If there is no Deputy, a team has a Backup: someone who can step in for a Lead in the Lead's absence. The Backup is generally a Lead Sysadmin or a Sysadmin who has previously led that team. |
| 137 | + |
| 138 | +**Team Members** |
| 139 | + |
| 140 | +Team members are people who significantly contribute to a team's goals and operate under the direction of the Team Lead. Passive involvement does not make someone a team member. |
| 141 | + |