Differences
This shows you the differences between two versions of the page.
doc_trecs:software_interface [2022/04/26 11:59] – vor | doc_trecs:software_interface [2023/11/24 14:31] – bil | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Software interface ====== | ====== Software interface ====== | ||
- | There are several software interfaces available to monitor the status of the t.RECS< | + | There are several software interfaces available to monitor the status of the t.RECS< |
===== Management WebGUI ===== | ===== Management WebGUI ===== | ||
Line 13: | Line 13: | ||
|{{ : | |{{ : | ||
- | Figure 1 shows the first call of the Management WebGUI. It is organized into three columns. The first is on the left-hand side and contains the following: | + | On the left side is a menu, which can be toggled by clicking the menu button in the upper left corner |
- | [[documentation: | + | [[doc_trecs: |
- | [[documentation: | + | [[doc_trecs: |
- | [[documentation: | + | [[doc_trecs: |
- | [[documentation: | + | [[doc_trecs: |
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs:software_interface# | ||
+ | [[doc_trecs: | ||
- | The second colum contains the buttons and sliders to manipulate the system. While the third colum is mostly for history information like power usage and temperature graphs. | + | ==== Dashboard ==== |
- | ==== Overview ==== | + | The Dashboard is shown first when opening the WebGUI and displays the summarized system health status. |
- | All units that are installed in the rack and that are managed by the software are summarized on this page. | + | < |
- | The total power usage is summed up over all managed units. | + | {{: |
- | + | ||
- | < | + | |
- | {{: | + | |
==== Management ==== | ==== Management ==== | ||
- | An overview of the selected unit can be seen in this tab. The fans can be regulated by dragging | + | In this view, nodes can be turned on or off with a quick menu, which opens when clicking on the gear symbol of a node.\\ |
+ | Multiple | ||
+ | The view also shows fan monitoring data and allows a detailed look at the temperature map of the system' | ||
+ | By clicking | ||
+ | Furthermore, the view displays the summarized system health status. | ||
- | < | + | < |
+ | {{:doc_trecs:t.recs_management.png?direct|Management View}}</ | ||
- | A quick menu to control a node can be opened by klicking on the gear next to an CXP node. In this menu the node can be switched on and off and the KVM can be switched to the node. | ||
- | < | + | === Node Management |
- | <WRAP round tip> | + | This view allows controlling the power state of the selected node and monitoring its detailed status values and graphs.\\ |
- | Apalis nodes do not show a management pop-up button due to size constraints.\\ Click on the node button while pressing the " | + | It is also possible |
- | </ | + | If the node is running and the [[documentation: |
- | <WRAP round tip> | + | <imgcaption Node Management View|> |
- | When pressing the " | + | {{: |
- | </WRAP> | + | |
- | === Node management === | ||
- | On this page the selected node can be controlled and detailed status values and graphs can be seen.\\ | + | === Network === |
- | By klicking on the arrow, pointing downwards in the upper bar next to the nodename, the other nodes of the unit can be chosen. | + | |
- | < | + | The network |
+ | In addition to that, VLANs of the node network can be configured and assigned to the ports of the nodes and the backpanel. | ||
- | ==== Global settings ==== | + | < |
+ | {{: | ||
- | All IP< | ||
- | The firmware for the whole RECS< | ||
- | For the update process **all modules will be powered off!**\\ | ||
- | < | + | === Composition === |
- | ==== Log viewer ==== | + | This view allows the configuration of the PCIe resources in the form of composed nodes.\\ |
+ | A composed node is a reserved bundle of resources, which utilize PCIe functions.\\ | ||
+ | A wizard guides the user through the process of creating such composed nodes. | ||
- | In the system healths tab of the log page the status changes of the sensors, fan and boards can be seen. | + | < |
+ | {{: | ||
- | < | ||
- | In the java tab of the log page all messages regarding the software can be found. | + | === Users === |
- | < | + | This view provides user management. Users can be created, edited, or deleted.\\ |
+ | Additionally, | ||
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Settings === | ||
+ | |||
+ | This view allows changing system-wide preferences (e.g. regarding the interfaces of the system). | ||
+ | |||
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Time === | ||
+ | |||
+ | Here, the system time can be set either manually or using NTP. | ||
+ | |||
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Firmware === | ||
+ | |||
+ | This view shows the currently installed versions of the firmware and management software.\\ | ||
+ | Furthermore, | ||
+ | |||
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Logs === | ||
+ | |||
+ | In the System Events tab of this view, the status changes of the sensors, fan and boards can be seen.\\ | ||
+ | In the Java Messages tab, all messages regarding the software can be found.\\ | ||
Several filters can be set for both tabs at the top.\\ | Several filters can be set for both tabs at the top.\\ | ||
- | At the bottom the whole log can be downloaded as a ZIP file containing the individual logfiles. | + | The whole log can be downloaded as a ZIP file containing the individual logfiles. |
+ | |||
+ | < | ||
+ | {{: | ||
===== Redfish API ===== | ===== Redfish API ===== | ||
- | The documentation of the RECS< | + | The management software also features a Redfish API.\\ |
+ | The documentation | ||
===== REST API ===== | ===== REST API ===== | ||
Line 84: | Line 127: | ||
==== Access ==== | ==== Access ==== | ||
- | The RECS< | + | The REST API is accessible via the management |
Accessing the REST API requires HTTP Basic authentication. The authenticated user has to be in the " | Accessing the REST API requires HTTP Basic authentication. The authenticated user has to be in the " | ||
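
As a rough illustration of the HTTP Basic authentication requirement, the following Python sketch builds (but does not send) a GET request for a REST resource. The URL layout `https://host/REST/<resource>` is taken from the curl examples on this page; the helper name `build_rest_request` and the credentials are hypothetical placeholders, not values defined by the management software.

```python
import base64
import urllib.request


def build_rest_request(host: str, resource: str, user: str, password: str) -> urllib.request.Request:
    """Build a GET request for https://<host>/REST/<resource> with HTTP Basic auth."""
    # HTTP Basic auth: base64("user:password") in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"https://{host}/REST/{resource}", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # "host", "admin" and "secret" are placeholders, not real defaults.
    req = build_rest_request("host", "rcu", "admin", "secret")
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

The request object could then be passed to `urllib.request.urlopen`. Note that the curl examples use `-k`, which skips certificate verification; an equivalent Python call would need a correspondingly configured SSL context, which is left out of this sketch.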
Line 90: | Line 133: | ||
==== Components ==== | ==== Components ==== | ||
- | The RECS< | + | The REST API makes all hardware components in the cluster available as XML trees in software. The following components are supported by the API: \\ |
^ Attribute ^ Description ^ | ^ Attribute ^ Description ^ | ||
- | |'' | + | |'' |
- | |'' | + | |
|'' | |'' | ||
- | |'' | + | |'' |
- | |'' | + | |
- | Many resources also return lists of components. These are named according to the scheme < | + | === RCU === |
- | Example | + | The main entrypoint |
- | < | + | Request: |
- | < | + | < |
- | lastSensorUpdate=" | + | curl -X GET -k -i https://host/REST/rcu |
- | < | + | </ |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </ | + | |
- | </ | + | |
- | === Node === | + | Response: |
+ | <code xml> | ||
+ | <rcu name=" | ||
+ | < | ||
+ | <sensor name="Node average temperature" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | </ | ||
+ | </ | ||
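
A response of this shape could be processed with a few lines of Python. The sketch below is only an assumption-laden illustration: the `sensor` element and its `name` and `unit` attributes appear in the responses on this page, while the sample document, the `value` attribute and all concrete values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical response: only the element names and the "name"/"unit"
# attributes come from the page; the "value" attribute and all values are made up.
SAMPLE = """
<rcu name="example">
  <temperature>
    <sensor name="Node average temperature" unit="C" value="41.5"/>
  </temperature>
  <power>
    <sensor name="Baseboard infrastructure power" unit="W" value="12.0"/>
  </power>
</rcu>
"""


def sensors_by_name(xml_text: str) -> dict:
    """Map each sensor's name to a dict of all its XML attributes."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree, so the nesting depth of <sensor> does not matter.
    return {sensor.get("name"): dict(sensor.attrib) for sensor in root.iter("sensor")}


if __name__ == "__main__":
    for name, attributes in sensors_by_name(SAMPLE).items():
        print(name, attributes)
```

Returning the raw attribute dictionary avoids hard-coding which attribute carries the reading, since that detail is not confirmed here.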
- | Example XML: | + | Attributes: \\ |
- | + | ||
- | <code xml>< | + | |
- | actualPEGPowerUsage=" | + | |
- | baseboardId=" | + | |
- | lastSensorUpdate=" | + | |
- | highestTemperature=" | + | |
- | + | ||
- | The following table shows the possible attributes (some are optional) and their meaning: \\ | + | |
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
|'' | |'' | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | In accordance to the component node the API offers nodeList which returns multiple instances of node. | + | Nested elements: \\ |
- | === Backplane | + | ^ Element ^ Description ^ Unit ^ Data type ^ |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |||
+ | |||
+ | === Baseboard | ||
- | Example XML: | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
- | <code xml><backplane position=" | + | Response: |
- | lastSensorUpdate="1465470151268"> | + | <code xml> |
- | <temperatures>24.0</temperatures> | + | <baseboard type=" |
- | <temperatures> | + | < |
- | <temperatures>26.0</temperatures> | + | < |
- | <temperatures>27.0</temperatures> | + | < |
- | <temperatures>28.0</temperatures> | + | < |
- | </backplane></ | + | < |
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | <sensor name="Baseboard infrastructure power" unit=" | ||
+ | <sensor name="Baseboard power usage (Node + PEG)" unit=" | ||
+ | < | ||
+ | </power> | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </temperature> | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | </ | ||
+ | </ | ||
- | The attributes have the following meaning: \\ | + | Attributes: \\ |
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
|'' | |'' | ||
- | |'' | + | |'' |
+ | |'' | ||
- | In accordance to the component backplane the API offers backplaneList which returns multiple instances of backplane. | + | Nested elements: \\ |
- | === Baseboard === | + | ^ Element ^ Description ^ Unit ^ Data type ^ |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | Example XML: | ||
- | <code xml>< | + | === Node === |
- | lastSensorUpdate=" | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </ | + | |
- | The attributes have the following meaning: \\ | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
+ | |||
+ | Response: | ||
+ | <code xml> | ||
+ | <node baseboardPosition=" | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | < | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | </ | ||
+ | </ | ||
+ | |||
+ | Attributes: \\ | ||
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |
- | |'' | + | |
|'' | |'' | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | In accordance to the component baseboard the API offers baseboardList which returns multiple instances of baseboard. | + | Nested elements: \\ |
- | === RCU === | + | ^ Element ^ Description ^ Unit ^ Data type ^ |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | Example XML: | + | The API offers nodeList, which returns a list of node IDs. |
- | < | + | Request: |
- | < | + | < |
- | < | + | curl -X GET -k -i https://host/REST/node |
- | < | + | </ |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </ | + | |
- | The attributes have the following meaning: \\ | + | Response: |
+ | <code xml> | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | </ | ||
- | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |||
- | In accordance to the component rcu the API offers rcuList which returns multiple instances of rcu. | ||
- | === Rack === | + | === Fan === |
- | Example XML: | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
- | <code xml><rack description="Default rack" id="RCK_1"> | + | Response: |
- | < | + | <code xml> |
- | </ | + | <fan position="TRECS_1" |
+ | </ | ||
- | The attributes have the following meaning: \\ | + | Attributes: \\ |
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | In accordance to the component rack the API offers | + | The API offers |
- | ==== Resources | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
+ | |||
+ | Response: | ||
+ | <code xml> | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | </ | ||
+ | |||
+ | |||
+ | ==== Endpoints | ||
The resources are split into monitoring resources (for pure information gathering) and management resources (for changing the system configuration or state). | The resources are split into monitoring resources (for pure information gathering) and management resources (for changing the system configuration or state). | ||
Line 259: | Line 390: | ||
^ Attribute ^ Description ^ HTTP Method ^ | ^ Attribute ^ Description ^ HTTP Method ^ | ||
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |
|''/ | |''/ | ||
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
=== Management === | === Management === | ||
Line 280: | Line 404: | ||
^ Attribute ^ Description ^ HTTP method ^ Parameter ^ | ^ Attribute ^ Description ^ HTTP method ^ Parameter ^ | ||
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
+ | |''/ | ||
+ | |''/ | ||
+ | |''/ | ||
+ | |''/ | ||
=== Errors === | === Errors === | ||
Line 293: | Line 421: | ||
===== Prometheus ===== | ===== Prometheus ===== | ||
- | A Prometheus exporter is built in and can be enabled. It is accessible at '' | + | A Prometheus exporter is built in and can be enabled. It is accessible at '' |
- | The big advantage of the Prometheus exporter compared to other APIs is that it exports its metrics dynamically, so metrics can be added or removed at runtime after hardware is changed or hot-plugged. This makes it possible to export only the metrics of those microservers that are plugged in. As the RECS< | + | The big advantage of the Prometheus exporter compared to other APIs is that it exports its metrics dynamically, so metrics can be added or removed at runtime after hardware is changed or hot-plugged. This makes it possible to export only the metrics of those microservers that are plugged in. As the RECS has a modular approach and every RECS can be equipped with different carrier blades and microserver configurations, |
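
To illustrate what dynamically exported metrics mean for a consumer, the following Python sketch compares two scrapes in the Prometheus text exposition format and reports which metric names appeared or disappeared. The metric names and sample scrapes are purely hypothetical; the names actually exported by the RECS are not listed here.

```python
def exported_metric_names(exposition_text: str) -> set:
    """Collect the metric names found in Prometheus text exposition output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The name ends at the label block "{" or at the first space.
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


# Hypothetical scrapes before and after a microserver is plugged in.
BEFORE = """
# TYPE example_node_power_watts gauge
example_node_power_watts{node="1"} 12.5
"""
AFTER = """
# TYPE example_node_power_watts gauge
example_node_power_watts{node="1"} 12.5
example_node_power_watts{node="2"} 9.0
# TYPE example_node_temperature_celsius gauge
example_node_temperature_celsius{node="2"} 40
"""

if __name__ == "__main__":
    before, after = exported_metric_names(BEFORE), exported_metric_names(AFTER)
    print("added:", sorted(after - before))
    print("removed:", sorted(before - after))
```

A monitoring setup would normally leave this bookkeeping to Prometheus itself; the sketch only shows why a scraper should not assume a fixed set of metrics.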
==== Prometheus Configuration ==== | ==== Prometheus Configuration ==== |