Differences
This shows you the differences between two versions of the page.
doc_trecs:software_interface [2022/04/26 13:21] – vor | doc_trecs:software_interface [2023/11/28 08:54] (current) – [Components] bil | ||
---|---|---|---|
Line 1: | Line 1: | ||
====== Software interface ====== | ====== Software interface ====== | ||
- | There are several software interfaces available to monitor the status of the t.RECS< | + | There are several software interfaces available to monitor the status of the t.RECS< |
===== Management WebGUI ===== | ===== Management WebGUI ===== | ||
Line 12: | Line 12: | ||
|{{ : | |{{ : | ||
|{{ : | |{{ : | ||
+ | |||
+ | On the left side is a menu, which can be toggled by clicking the menu button in the upper left corner of the screen. The menu contains the following items: | ||
+ | |||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
+ | [[doc_trecs: | ||
==== Dashboard ==== | ==== Dashboard ==== | ||
- | Figure 1 shows the Dashboard | + | The Dashboard is seen first when opening the WebGUI and displays | ||
- | [[documentation: | + | < |
- | [[documentation:software_interface# | + | {{:doc_trecs:t.recs_dashboard.png? |
- | [[documentation: | + | |
- | [[documentation: | + | |
- | The second column contains the buttons and sliders to manipulate the system, while the third column is mostly for history information like power usage and temperature graphs. | + | ==== Management ==== | ||
- | ==== Overview ==== | + | In this view, nodes can be turned on or off with a quick menu, which opens when clicking on the gear symbol of a node.\\ |
+ | Multiple nodes can be controlled at once via the panel " | ||
+ | The view also shows fan monitoring data and allows a detailed look at the temperature map of the system' | ||
+ | By clicking on a node label, the respective [[doc_trecs: | ||
+ | Furthermore, the view displays the summarized system health status. | ||
- | All units that are installed in the rack and that are managed by the software are summarized on this page. | + | < |
- | The total power usage is summed up over all managed units. | + | {{: |
- | < | ||
- | {{: | ||
- | ==== Management | + | === Node Management === |
- | An overview | + | This view allows controlling the power state of the selected | ||
+ | It is also possible to change KVM settings or open a console | ||
+ | If the node is running and the [[documentation: | ||
- | < | + | < |
+ | {{:doc_trecs:t.recs_node-management.png?direct|Node Management View}}</ | ||
- | A quick menu to control a node can be opened by clicking on the gear next to a CXP node. In this menu, the node can be switched on and off and the KVM can be switched to the node. | ||
- | < | + | === Network === |
- | <WRAP round tip> | + | The network view allows changing the settings of the management port. This port is used to access the web interface and all APIs.\\ | ||
- | Apalis nodes do not show a management pop-up button due to size constraints.\\ Click on the node button while pressing the " | + | In addition to that, VLANs of the node network can be configured and assigned |
- | </ | + | |
- | <WRAP round tip> | + | <imgcaption Network View|> |
- | When pressing the " | + | {{: |
- | </WRAP> | + | |
- | === Node management === | ||
- | On this page the selected node can be controlled and detailed status values and graphs can be seen.\\ | + | === Composition === |
- | By clicking on the downward-pointing arrow in the upper bar next to the node name, the other nodes of the unit can be chosen. | + | |
- | < | + | This view allows the configuration of the PCIe resources in the form of composed nodes.\\ |
+ | A composed | ||
+ | A wizard leads through the process of creating such composed nodes. | ||
- | ==== Global settings ==== | + | < |
+ | {{: | ||
- | All IP< | ||
- | The firmware for the whole RECS< | ||
- | For the update process **all modules will be powered off!**\\ | ||
- | < | + | === Users === |
- | ==== Log viewer ==== | + | This view provides user management. Users can be created, edited, or deleted.\\ | ||
+ | Additionally, | ||
- | In the system healths tab of the log page the status changes of the sensors, fan and boards can be seen. | + | < |
+ | {{: | ||
- | < | ||
- | In the java tab of the log page all messages regarding the software can be found. | + | === Settings === |
- | < | + | This view allows changing system-wide preferences (e.g. regarding the interfaces of the system). |
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Time === | ||
+ | |||
+ | Here, the system time can be set either manually or using NTP. | ||
+ | |||
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Firmware === | ||
+ | |||
+ | This view shows the currently installed versions of the firmware and management software.\\ | ||
+ | Furthermore, | ||
+ | |||
+ | < | ||
+ | {{: | ||
+ | |||
+ | |||
+ | === Logs === | ||
+ | |||
+ | In the System Events tab of this view, the status changes of the sensors, fan and boards can be seen.\\ | ||
+ | In the Java Messages tab, all messages regarding the software can be found.\\ | ||
Several filters can be set for both tabs at the top.\\ | Several filters can be set for both tabs at the top.\\ | ||
- | At the bottom the whole log can be downloaded as a ZIP file containing the individual logfiles. | + | The whole log can be downloaded as a ZIP file containing the individual logfiles. |
+ | |||
+ | < | ||
+ | {{: | ||
===== Redfish API ===== | ===== Redfish API ===== | ||
- | The documentation of the RECS< | + | The management software also features a Redfish API.\\ |
+ | The documentation | ||
===== REST API ===== | ===== REST API ===== | ||
Line 86: | Line 127: | ||
==== Access ==== | ==== Access ==== | ||
- | The RECS< | + | The REST API is accessible via the management |
Accessing the REST API requires HTTP Basic authentication. The authenticated user has to be in the " | Accessing the REST API requires HTTP Basic authentication. The authenticated user has to be in the " | ||
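As a minimal sketch of what HTTP Basic authentication means for a client, the following Python snippet prepares a GET request with the required `Authorization` header without sending it. The host name `trecs.example` and the credentials are hypothetical placeholders; the `/REST/` path prefix is taken from the `curl` examples on this page.

```python
import base64
import urllib.request


def build_rest_request(host: str, path: str, user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request carrying an HTTP Basic Authorization header."""
    # HTTP Basic: base64-encode "user:password" and prefix it with "Basic ".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"https://{host}/REST/{path.lstrip('/')}", method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical host and credentials, for illustration only.
req = build_rest_request("trecs.example", "node", "alice", "secret")
print(req.full_url)
print(req.get_header("Authorization"))
```

The prepared request could then be passed to `urllib.request.urlopen`. Note that the `curl` examples use `-k`, which skips certificate verification; an equivalent Python client would need a correspondingly configured SSL context if the system uses a self-signed certificate.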
Line 92: | Line 133: | ||
==== Components ==== | ==== Components ==== | ||
- | The RECS< | + | The REST API makes all hardware components in the cluster available as XML trees in software. The following components are supported by the API: \\ |
^ Attribute ^ Description ^ | ^ Attribute ^ Description ^ | ||
- | |'' | + | |'' |
- | |'' | + | |
|'' | |'' | ||
- | |'' | + | |'' |
- | |'' | + | |
- | Many resources also return lists of components. These are named according to the scheme < | + | === RCU === |
- | Example | + | The main entry point | ||
- | + | ||
- | <code xml>< | + | |
- | < | + | |
- | lastSensorUpdate=" | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </ | + | |
- | </ | + | |
- | + | ||
- | === Node === | + | |
- | Example XML: | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
- | <code xml><node baseboardPosition="0" | + | Response: |
- | actualPEGPowerUsage="15.12053962324833" | + | <code xml> |
- | baseboardId="RCU_84055620466592_BB_1" health=" | + | <rcu name="RCUMaster (192.168.XX.YY)" |
- | lastSensorUpdate="1465470151268" | + | < |
- | highestTemperature="20.0" | + | <sensor name="Node average temperature" |
+ | <sensor name="Node highest temperature" | ||
+ | < | ||
+ | </ | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | <sensor name="RCU total power usage" unit="W" | ||
+ | < | ||
+ | <sensor name="RCU power usage (Node)" | ||
+ | </ | ||
+ | </ | ||
+ | </ | ||
- | The following table shows the possible attributes (some are optional) and their meaning: \\ | + | Attributes: \\ |
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
- | |'' | + | |
|'' | |'' | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | In accordance to the component node the API offers nodeList which returns multiple instances of node. | + | Nested elements: \\ |
- | === Backplane | + | ^ Element ^ Description ^ Unit ^ Data type ^ |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |||
+ | |||
+ | === Baseboard | ||
- | Example XML: | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
- | <code xml><backplane position=" | + | Response: |
- | lastSensorUpdate="1465470151268"> | + | <code xml> |
- | <temperatures>24.0</temperatures> | + | <baseboard type=" |
- | <temperatures> | + | < |
- | <temperatures>26.0</temperatures> | + | < |
- | <temperatures>27.0</temperatures> | + | < |
- | <temperatures>28.0</temperatures> | + | < |
- | </backplane></ | + | < |
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | <sensor name="Baseboard infrastructure power" unit=" | ||
+ | <sensor name="Baseboard power usage (Node + PEG)" unit=" | ||
+ | < | ||
+ | </power> | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </temperature> | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | </ | ||
+ | </ | ||
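A response like the one above can be processed with any XML parser. The following Python sketch groups the `sensor` elements by their parent element (such as `power` or `temperature`) and returns each sensor's attributes as a dictionary. The sample document is hand-written and abbreviated: the sensor named "Example temperature sensor" and all `unit` values are illustrative assumptions, and a real response carries further attributes that would simply appear in the dictionaries as well.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated sample; units and the temperature sensor name are assumptions.
SAMPLE = """
<baseboard>
  <power>
    <sensor name="Baseboard infrastructure power" unit="W"/>
    <sensor name="Baseboard power usage (Node + PEG)" unit="W"/>
  </power>
  <temperature>
    <sensor name="Example temperature sensor" unit="C"/>
  </temperature>
</baseboard>
"""


def sensors_by_group(xml_text: str) -> dict:
    """Map each group element tag to the list of attribute dicts of its sensor children."""
    root = ET.fromstring(xml_text.strip())
    groups = {}
    for group in root:
        sensors = [dict(sensor.attrib) for sensor in group.findall("sensor")]
        if sensors:  # skip children that contain no sensor elements
            groups[group.tag] = sensors
    return groups


groups = sensors_by_group(SAMPLE)
for tag, sensors in groups.items():
    for sensor in sensors:
        print(tag, sensor["name"])
```

Returning plain attribute dictionaries avoids hard-coding which attribute holds the reading, so the sketch does not depend on details of the response that are not shown here.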
- | The attributes have the following meaning: \\ | + | Attributes: \\ |
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
|'' | |'' | ||
- | |'' | + | |'' |
+ | |'' | ||
- | In accordance to the component backplane the API offers backplaneList which returns multiple instances of backplane. | + | Nested elements: \\ |
- | === Baseboard === | + | ^ Element ^ Description ^ Unit ^ Data type ^ |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | Example XML: | ||
- | <code xml>< | + | === Node === |
- | lastSensorUpdate=" | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </ | + | |
- | The attributes have the following meaning: \\ | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
+ | |||
+ | Response: | ||
+ | <code xml> | ||
+ | <node baseboardPosition=" | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | < | ||
+ | < | ||
+ | <sensor name=" | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | < | ||
+ | <sensor name=" | ||
+ | </ | ||
+ | </ | ||
+ | </ | ||
+ | |||
+ | Attributes: \\ | ||
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |
- | |'' | + | |
|'' | |'' | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | In accordance to the component baseboard the API offers baseboardList which returns multiple instances of baseboard. | + | Nested elements: \\ |
- | === RCU === | + | ^ Element ^ Description ^ Unit ^ Data type ^ |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | Example XML: | + | The API offers nodeList, which returns a list of the IDs of all nodes within the system. |
- | < | + | Request: |
- | < | + | < |
- | < | + | curl -X GET -k -i https://host/REST/node |
- | < | + | </ |
- | < | + | |
- | < | + | |
- | < | + | |
- | < | + | |
- | </ | + | |
- | The attributes have the following meaning: \\ | + | Response: |
+ | <code xml> | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | </ | ||
- | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |'' | ||
- | |||
- | In accordance to the component rcu the API offers rcuList which returns multiple instances of rcu. | ||
- | === Rack === | + | === Fan === |
- | Example XML: | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
- | <code xml><rack description="Default rack" id="RCK_1"> | + | Response: |
- | < | + | <code xml> |
- | </ | + | <fan position="TRECS_1" |
+ | </ | ||
- | The attributes have the following meaning: \\ | + | Attributes: \\ |
^ Attribute ^ Description ^ Unit ^ Data type ^ | ^ Attribute ^ Description ^ Unit ^ Data type ^ | ||
- | |'' | + | |'' |
- | |'' | + | |'' |
- | |'' | + | |'' |
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
+ | |'' | ||
- | In accordance to the component rack the API offers | + | The API offers |
- | ==== Resources | + | Request: |
+ | <code bash> | ||
+ | curl -X GET -k -i https:// | ||
+ | </ | ||
+ | |||
+ | Response: | ||
+ | <code xml> | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | </ | ||
+ | |||
+ | |||
+ | ==== Endpoints | ||
The resources are split into monitoring resources (for pure information gathering) and management resources (for changing the system configuration or state). | The resources are split into monitoring resources (for pure information gathering) and management resources (for changing the system configuration or state). | ||
Line 261: | Line 381: | ||
^ Attribute ^ Description ^ HTTP method ^ | ^ Attribute ^ Description ^ HTTP method ^ | ||
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |
|''/ | |''/ | ||
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
- | |''/ | + | |
=== Management === | === Management === | ||
Line 282: | Line 395: | ||
^ Attribute ^ Description ^ HTTP method ^ Parameter ^ | ^ Attribute ^ Description ^ HTTP method ^ Parameter ^ | ||
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
- | |''/ | + | |''/ |
+ | |''/ | ||
+ | |''/ | ||
+ | |''/ | ||
+ | |''/ | ||
=== Errors === | === Errors === | ||
Line 295: | Line 412: | ||
===== Prometheus ===== | ===== Prometheus ===== | ||
- | A Prometheus exporter is built-in and can be enabled. It is accessible at '' | + | A Prometheus exporter is built-in and can be enabled. It is accessible at '' | ||
- | The big advantage of the Prometheus exporter compared to other APIs is that it exports its metrics dynamically: metrics can be added or removed at runtime after hardware is changed or hot-plugged. This makes it possible to export only the metrics of those microservers that are plugged in. As the RECS< | + | The big advantage of the Prometheus exporter compared to other APIs is that it exports its metrics dynamically: metrics can be added or removed at runtime after hardware is changed or hot-plugged. This makes it possible to export only the metrics of those microservers that are plugged in. As the RECS has a modular approach and every RECS can be equipped with different carrier blades and microserver configurations, | ||
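Because the set of exported metrics can change at runtime, a client may want to compare two scrapes to see which metrics appeared or disappeared. The following Python sketch extracts metric names from the standard Prometheus text exposition format and computes the difference. The metric names `example_node_power_watts` and `example_fan_rpm` are hypothetical and do not reflect the names the exporter actually uses.

```python
def metric_names(exposition_text: str) -> set:
    """Collect metric names from Prometheus text exposition format."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # A sample line is "<name>{labels} <value>" or "<name> <value>".
        name = line.split("{", 1)[0].split(None, 1)[0]
        names.add(name)
    return names


# Hypothetical scrapes before and after hot-plugging hardware.
before = """# TYPE example_node_power_watts gauge
example_node_power_watts{node="1"} 12.5
"""
after = before + """# TYPE example_fan_rpm gauge
example_fan_rpm 4200
"""

added = metric_names(after) - metric_names(before)
removed = metric_names(before) - metric_names(after)
print("added:", sorted(added))
print("removed:", sorted(removed))
```

Splitting on `{` before splitting on whitespace keeps label values that contain spaces from being mistaken for the metric name.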
==== Prometheus Configuration ==== | ==== Prometheus Configuration ==== |