Setting up DSE Spark application permissions

Authorize Spark application submissions, management, and use.

Manage user access to Spark applications. The CQL resources for Spark applications are WORKPOOL and SUBMISSION. The CREATE permission on a workpool resource controls whether a user can submit a Spark application to DSE. The MODIFY permission on a submission resource controls whether a user can manage and remove applications.
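
As a minimal sketch of how the two resources separate duties, the statements below give one role the ability to submit applications and another the ability to manage them. The role names app_submitter and app_operator are hypothetical and assumed to exist already; the statement forms are the ones listed in the procedure that follows.

```cql
// app_submitter and app_operator are hypothetical role names.
// CREATE on a workpool: the role may submit Spark applications.
GRANT CREATE ON ANY WORKPOOL TO app_submitter;

// MODIFY on a submission: the role may manage and remove applications.
GRANT MODIFY ON ANY SUBMISSION TO app_operator;
```

Under this split, app_submitter can still manage the applications it submits itself, because the submitting role is automatically granted MODIFY on its own application.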

Procedure

Use CQL shell (cqlsh) to authorize access to DSE Resource Manager and Spark applications. All commands must be entered on a DSE Analytics node in the cluster.
  • Access required for AlwaysOn SQL roles:
    GRANT ALL PERMISSIONS ON REMOTE OBJECT AlwaysOnSqlRoutingRPC TO role_name;
    GRANT ALL PERMISSIONS ON REMOTE OBJECT DseResourceManager TO role_name;
    GRANT ALL PERMISSIONS ON REMOTE OBJECT DseClientTool TO role_name;
    GRANT SELECT, MODIFY ON dse_analytics.alwayson_sql_cache_table TO role_name;
    GRANT SELECT, MODIFY ON dse_analytics.alwayson_sql_info TO role_name;
  • Access to only DSE Resource Manager:
    GRANT EXECUTE ON REMOTE OBJECT DseResourceManager TO role_name;
  • Run applications:
    GRANT EXECUTE ON REMOTE OBJECT DseClientTool TO role_name;
    Note: Each DSE Analytics user must have permission to make remote procedure calls with DSE client tools.
  • For roles that are not superusers, access to the following tables is required:
    GRANT SELECT ON system.size_estimates TO role_name;
    GRANT SELECT, MODIFY ON "HiveMetaStore".sparkmetastore TO role_name;
    Additional permissions are required when running AlwaysOn SQL:
    GRANT SELECT, MODIFY ON dse_analytics.alwayson_sql_cache_table TO role_name;
  • Submit applications:
    • To all datacenters:
      GRANT CREATE ON ANY WORKPOOL TO role_name;
      Tip: Use the REVOKE command to remove access:
      REVOKE CREATE ON ANY WORKPOOL FROM role_name;
    • A particular datacenter:
      GRANT CREATE ON WORKPOOL datacenter_name TO role_name;
      Tip: Use the REVOKE command to remove access:
      REVOKE CREATE ON WORKPOOL datacenter_name FROM role_name;
    Note: The role used to submit an application is automatically granted permission to MODIFY the application.
  • Modify applications:
    • All applications:
      GRANT MODIFY ON ANY SUBMISSION TO role_name;
      Tip: Use the REVOKE command to remove access:
      REVOKE MODIFY ON ANY SUBMISSION FROM role_name;
    • All applications in a particular datacenter:
      GRANT MODIFY ON ANY SUBMISSION IN WORKPOOL 'datacenter_name.*' TO role_name;
      Tip: Use the REVOKE command to remove access:
      REVOKE MODIFY ON ANY SUBMISSION IN WORKPOOL 'datacenter_name.*' FROM role_name;
      Note: When specifying a datacenter, you must also specify a workpool name or wildcard. DSE versions prior to 6.0 accepted the datacenter name alone; omitting the workpool name or wildcard now results in a syntax error.
    • Specific application in a particular datacenter:
      GRANT MODIFY ON SUBMISSION id IN WORKPOOL 'datacenter_name.*' TO role_name;
      Tip: Use the REVOKE command to remove access:
      REVOKE MODIFY ON SUBMISSION id IN WORKPOOL 'datacenter_name.*' FROM role_name;

      For example, to revoke the MODIFY permission on an application started by a role named sparkrole, first find the application ID:

      LIST ALL PERMISSIONS OF sparkrole;
      role      | username  | resource                                                               | permission | granted | restricted | grantable
      ----------+-----------+------------------------------------------------------------------------+------------+---------+------------+-----------
      ...
      sparkrole | sparkrole | <submission app-20190519161729-0004 in work pool default in Analytics> |     MODIFY |    True |      False |     False
      sparkrole | sparkrole | <submission app-20190519161729-0004 in work pool default in Analytics> |  AUTHORIZE |    True |      False |     False
      sparkrole | sparkrole | <submission app-20190519161729-0004 in work pool default in Analytics> |   DESCRIBE |    True |      False |     False

      In the resource column, the application ID follows the word submission. Here, the application ID is app-20190519161729-0004.

      Use the REVOKE command, specifying the role name together with the application ID, datacenter, and workpool shown in the resource column of the previous output:

      REVOKE MODIFY ON SUBMISSION 'app-20190519161729-0004' IN WORKPOOL 'Analytics.default' FROM sparkrole;

      To revoke all permissions from the application, use REVOKE ALL:

      REVOKE ALL ON SUBMISSION 'app-20190519161729-0004' IN WORKPOOL 'Analytics.default' FROM sparkrole;
      Note: When specifying a datacenter, you must also specify a workpool name or wildcard. DSE versions prior to 6.0 accepted the datacenter name alone; omitting the workpool name or wildcard now results in a syntax error.
  • Use DSE GraphFrames:
    GRANT EXECUTE ON REMOTE OBJECT DseGraphRpc TO role_name;
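
Putting several of the steps above together, a non-superuser role that only needs to run and submit Spark applications in any datacenter might be set up as sketched below. The role name spark_dev is hypothetical, and the CREATE ROLE statement omits any authentication options (such as a password) that your cluster may require.

```cql
// spark_dev is a hypothetical role name used only for illustration.
CREATE ROLE spark_dev WITH LOGIN=true;

// Allow remote procedure calls with DSE client tools (run applications).
GRANT EXECUTE ON REMOTE OBJECT DseClientTool TO spark_dev;

// Allow access to DSE Resource Manager.
GRANT EXECUTE ON REMOTE OBJECT DseResourceManager TO spark_dev;

// Table access required for roles that are not superusers.
GRANT SELECT ON system.size_estimates TO spark_dev;
GRANT SELECT, MODIFY ON "HiveMetaStore".sparkmetastore TO spark_dev;

// Allow submitting applications to all datacenters.
GRANT CREATE ON ANY WORKPOOL TO spark_dev;

// Check the result.
LIST ALL PERMISSIONS OF spark_dev;
```

The final LIST ALL PERMISSIONS statement is only a check; its output should show the permissions granted here. No explicit MODIFY grant is included because the role is automatically granted MODIFY on each application it submits.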

Example

Create a role for AlwaysOn SQL (alwayson_sql):
CREATE ROLE alwayson_sql WITH LOGIN=true; // role name matches  

// Required if  true
GRANT EXECUTE ON ALL AUTHENTICATION SCHEMES TO alwayson_sql; 

// Spark RPC settings
GRANT ALL PERMISSIONS ON REMOTE OBJECT DseResourceManager TO alwayson_sql;
GRANT ALL PERMISSIONS ON REMOTE OBJECT DseClientTool TO alwayson_sql;
GRANT ALL PERMISSIONS ON REMOTE OBJECT AlwaysOnSqlRoutingRPC TO alwayson_sql;
GRANT ALL PERMISSIONS ON REMOTE OBJECT AlwaysOnSqlNonRoutingRPC TO alwayson_sql;

// Spark and DSE required table access
GRANT SELECT ON system.size_estimates TO alwayson_sql;
GRANT SELECT, MODIFY ON "HiveMetaStore".sparkmetastore TO alwayson_sql;
GRANT SELECT, MODIFY ON dse_analytics.alwayson_sql_cache_table TO alwayson_sql; 
GRANT SELECT, MODIFY ON dse_analytics.alwayson_sql_info TO alwayson_sql;

// Permissions to create and change applications  
GRANT CREATE, DESCRIBE ON ANY WORKPOOL TO alwayson_sql;
GRANT MODIFY, DESCRIBE ON ANY SUBMISSION TO alwayson_sql;
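
To confirm the grants, one option is to list the role's permissions with the same LIST command used in the procedure. The REVOKE statements sketch how the application permissions could later be withdrawn; they mirror the REVOKE forms shown in the procedure and are not part of the original example.

```cql
// Review everything granted to the role.
LIST ALL PERMISSIONS OF alwayson_sql;

// Withdraw the ability to submit and to modify applications.
REVOKE CREATE ON ANY WORKPOOL FROM alwayson_sql;
REVOKE MODIFY ON ANY SUBMISSION FROM alwayson_sql;
```

These REVOKE statements remove only CREATE and MODIFY; the DESCRIBE permissions granted in the example stay in place until they are revoked separately.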