s3-bucket-doc

Amazon S3 (Simple Storage Service) is widely used in our infrastructure to store many types of data, ranging from static web assets and backups to logs, analytics data, user uploads, and integration files. As our system grows and more teams interact with S3, it becomes increasingly important to maintain clear documentation and ownership for every bucket.

This document is intended to serve as a single source of truth for all S3 buckets in use. It provides:

  1. Clear Purpose Definitions
    Each S3 bucket should have a well-defined purpose. This helps avoid misuse, redundant storage, and unstructured data sprawl. For instance, backups should be separated from user uploads, and integration files should not be mixed with analytics datasets.

  2. Access Control Guidelines
    To minimize risk and uphold security best practices, each bucket should follow the principle of least privilege. A documented purpose also makes it easier to assign the right IAM roles, bucket policies, and access controls based on the data's sensitivity, environment (e.g., dev, staging, production), and team responsibilities.

  3. Monitoring and Cost Management
    Without proper structure, S3 costs can spiral due to orphaned objects, unnecessary data retention, and lack of cleanup processes. Documenting each bucket's usage enables:

    • Proactive monitoring (e.g., via CloudWatch or Athena)
    • Lifecycle management (e.g., archiving old backups or auto-deleting temp files)
    • Budget tracking and alerts using tags and AWS Cost Explorer
  4. Operational Transparency & Collaboration
    Teams across engineering, DevOps, QA, data, and product need to interact with S3. This documentation ensures everyone understands:

    • What each bucket is for
    • Who owns or manages the bucket
    • How long data should be retained
    • Whether the bucket is public or private
    • How and when the data is used or cleaned
  5. Compliance and Data Governance
    For teams working in regulated environments or handling user data, proper classification and documentation of S3 buckets are critical for compliance (e.g., GDPR, ISO, SOC 2). This also helps in regular audits and access reviews.
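The points above can be captured as one structured record per bucket. The following is a minimal sketch, not an established schema: the `BucketRecord` class, its field names, and the example bucket name are all assumptions made for illustration. The `lifecycle_rule` method returns a dictionary in the shape S3 lifecycle rules use (`ID`, `Filter`, `Status`, `Expiration`), so a documented retention period could later be turned into an actual lifecycle configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BucketRecord:
    """One documentation entry per bucket (hypothetical schema)."""

    name: str
    purpose: str
    owner: str
    environment: str                      # e.g. "dev", "staging", "production"
    public: bool = False                  # private unless stated otherwise
    retention_days: Optional[int] = None  # None means "no automatic expiry"

    def tags(self) -> dict:
        """Tags that cost reports and budget alerts could be grouped by."""
        return {"owner": self.owner, "environment": self.environment}

    def lifecycle_rule(self) -> Optional[dict]:
        """Build an expiration rule from the documented retention period."""
        if self.retention_days is None:
            return None
        return {
            "ID": f"expire-after-{self.retention_days}d",
            "Filter": {"Prefix": ""},     # empty prefix: applies to every object
            "Status": "Enabled",
            "Expiration": {"Days": self.retention_days},
        }


# Hypothetical example entry; the bucket name and owner are placeholders.
record = BucketRecord(
    name="example-temp-exports",
    purpose="Temporary exports shared between teams",
    owner="devops",
    environment="staging",
    retention_days=30,
)
rule = record.lifecycle_rule()
```

Keeping the record immutable (`frozen=True`) is a design choice for this sketch: a change to ownership or retention then has to be made as an explicit new entry rather than a silent in-place edit.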

Bucket List

Purpose:
Used by the Channel Manager (CM) system for storing files specific to CM functionalities, integrations, and partner-related assets.

Detailed Description:
This bucket holds the files needed for channel manager operations, such as rate and availability updates, OTA mappings, logs, and partner-specific configuration files. It may include XML templates, credentials (only if encrypted), or PMS integration snapshots.

Monitoring & Governance:

  • Access Control: Restrict by team and integration partner. Use prefix-based IAM permissions.
  • Data Retention: N/A.
  • Monitoring: Enable object-level logging and CloudWatch alerts for failures in integration workflows.
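As an illustration of prefix-based IAM permissions, the sketch below builds a policy document that limits a principal to a single prefix inside a bucket. The function name, the bucket name `example-cm-bucket`, and the prefix `partners/partner-a` are hypothetical; the actual bucket layout is not described here. The policy elements themselves (`Version`, `Statement`, the `s3:prefix` condition key, and the ARN formats) follow standard IAM policy syntax.

```python
import json


def prefix_policy(bucket: str, prefix: str, allow_write: bool = False) -> dict:
    """Return an IAM policy document scoped to one prefix of one bucket."""
    prefix = prefix.strip("/")
    actions = ["s3:GetObject"]
    if allow_write:
        actions.append("s3:PutObject")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is narrowed
                # with a condition on the requested prefix.
                "Sid": "ListOnlyThisPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Object-level actions are narrowed through the resource ARN.
                "Sid": "ObjectAccessUnderPrefix",
                "Effect": "Allow",
                "Action": actions,
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


# Hypothetical bucket and prefix, for illustration only.
policy = prefix_policy("example-cm-bucket", "partners/partner-a", allow_write=True)
policy_json = json.dumps(policy, indent=2)
```

Generating one such document per team or integration partner keeps each principal confined to its own prefix, which is one way to apply least privilege in a shared bucket.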

Purpose:
Stores static assets and resources for Webku.

Detailed Description:
This includes frontend assets like HTML templates, JavaScript, CSS files, media (images/videos), version, and site-specific branding. It's often used to support static delivery via CloudFront.

Monitoring & Governance:

  • Access Control: Read-only access for frontends; write access only for CI/CD or web admin tools.
  • Versioning: Enable versioning for uploaded assets.
  • Lifecycle Policy: Review logos and favicons annually and remove unused files.
  • Monitoring: Use CloudFront/S3 access logs to detect unused or outdated assets.
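One possible way to detect unused assets is to compare the bucket's object keys against the most recent request seen for each key. The sketch below assumes that a `last_access` mapping has already been extracted from CloudFront or S3 access logs; the log parsing itself is not shown, and the function name, the sample keys, and the 365-day threshold (chosen to match the annual review) are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, List


def stale_assets(
    all_keys: Iterable[str],
    last_access: Dict[str, datetime],
    now: datetime,
    max_age_days: int = 365,
) -> List[str]:
    """Return keys that were never requested or not requested recently."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for key in all_keys:
        seen = last_access.get(key)
        # A key missing from the logs counts as unused.
        if seen is None or seen < cutoff:
            stale.append(key)
    return sorted(stale)


# Hypothetical sample data, for illustration only.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
keys = ["css/site.css", "img/logo-old.png", "img/logo.png"]
access = {
    "css/site.css": datetime(2024, 12, 20, tzinfo=timezone.utc),
    "img/logo-old.png": datetime(2023, 3, 1, tzinfo=timezone.utc),
}
candidates = stale_assets(keys, access, now)
```

The result is a list of review candidates rather than a delete list: an asset may be absent from the logs simply because logging was enabled only recently.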

Purpose:
A shared S3 bucket for general-purpose files not specific to any subsystem, used across various services and teams.

Detailed Description:
Contains miscellaneous files such as documentation, temporary exports/imports, reports, and common configurations shared across teams. This bucket acts as a catch-all and must be tightly governed to avoid becoming disorganized.

Structure Breakdown:

  • /assets/ : Holds shared media assets, such as logos and favicons, for the BNL and tripla products.
  • /paymenthub
    • /tos : Stores Terms of Service PDFs or HTML files for use in the Payment Hub.
      • Data Retention : daily
  • website-bnl : Stores all assets for the bookandlink website.
  • website-payku : Stores all assets for the payku website.

Monitoring & Governance:

  • Access Control: -
  • Structure: Define a folder policy by environment (e.g., /qa/, /staging/, /production/).
  • Data Retention: -
  • Monitoring: Track upload frequency by prefix to detect misuse or abandoned folders.
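The folder policy and the upload tracking described above could be checked together from an object listing. The sketch below assumes the listing is already available as `(key, last_modified)` pairs, for example from a bucket inventory; the function names, the sample keys, and the choice to look only at the top-level prefix are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Tuple

# Environment folders taken from the example folder policy.
ALLOWED_ENVIRONMENTS = ("qa", "staging", "production")


def top_prefix(key: str) -> str:
    """Return the first path segment of an object key."""
    return key.lstrip("/").split("/", 1)[0]


def uploads_by_prefix(
    objects: Iterable[Tuple[str, datetime]], since: datetime
) -> Counter:
    """Count objects modified on or after `since`, per top-level prefix."""
    counts: Counter = Counter()
    for key, modified in objects:
        if modified >= since:
            counts[top_prefix(key)] += 1
    return counts


def review(
    objects: Iterable[Tuple[str, datetime]], since: datetime
) -> Dict[str, List[str]]:
    """Flag prefixes outside the folder policy and prefixes with no recent uploads."""
    objects = list(objects)
    recent = uploads_by_prefix(objects, since)
    seen = {top_prefix(key) for key, _ in objects}
    return {
        "outside_policy": sorted(p for p in seen if p not in ALLOWED_ENVIRONMENTS),
        "abandoned": sorted(p for p in seen if recent[p] == 0),
    }


# Hypothetical listing, for illustration only.
listing = [
    ("qa/report.csv", datetime(2025, 1, 10, tzinfo=timezone.utc)),
    ("staging/export.zip", datetime(2023, 6, 1, tzinfo=timezone.utc)),
    ("misc/notes.txt", datetime(2022, 1, 1, tzinfo=timezone.utc)),
]
report = review(listing, since=datetime(2024, 7, 1, tzinfo=timezone.utc))
```

Running a check like this on a schedule would surface stray top-level folders early, before the catch-all bucket becomes hard to clean up.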